A Computational Modelling and Algorithmic Design Approach of Digital Watermarking in Deep Neural Networks

Article history: Received: 21 October 2020; Accepted: 12 December 2020; Online: 25 December 2020

In this paper we propose an algorithmic approach to digital watermarking based on a Convolutional Neural Network (CNN). The approach outperforms the existing frequency domain techniques in all aspects, including security, and is assessed against criteria specific to neural networks such as the embedding conditions and the types of watermarking attack. This research addresses digital watermarking in deep neural networks; through comprehensive experiments, computational modelling, and algorithm design, we examine the performance of the built system to demonstrate the potential of watermarking neural networks. We also discuss the inability of an intruder to retrieve the data without knowledge of the architecture and keys, and compare the results of the proposed method with state-of-the-art methods under different noise levels and attacks.


Introduction
The research presented here is an extension of work originally presented at the International Conference on Artificial Intelligence and Signal Processing (AISP), 2020 [1]. The digital revolution and the internet have paved the way for the creation of massive amounts of digital information, including images, videos, transactions, and intellectual property. The ease with which this digital data can be copied and reproduced has created avenues for copyright infringement. The explosion of digital multimedia devices has resulted in the creation of large volumes of data and has increased the demand for (and the role of) data hiding techniques.
Digital watermarking is employed for various applications such as copyright protection (ownership assertion), broadcast monitoring (verifying whether content was actually broadcast), tamper detection (persistent item identification, forgery detection), data authentication and verification (integrity verification), fingerprinting (transaction tracking and privacy control), content description (labelling and captioning), publication monitoring and copy control (unauthorized distribution), covert communication (data hiding), medical applications (annotation and privacy control), and legacy system enhancement (backward compatibility) [2]. All information hiding techniques revolve around three parameters: imperceptibility, robustness, and payload capacity [3]. While payload capacity is important in steganography, digital watermarking needs a trade-off between imperceptibility and robustness to achieve high-quality data hiding [4]. Imperceptibility is the ability of a data hiding technique to leave the data unchanged to human perception, i.e. an unintended user should not be able to tell whether the image has been watermarked [5]. Robustness is the ability of the watermarked data to withstand attacks and threats.
Other goals of digital watermarking are security (the watermark should be secret and undetectable by unauthorized parties), effectiveness (ease of detection immediately after embedding), uniqueness (multiple watermarks can coexist), cost effectiveness (in terms of hardware and computational speed), and scalability [6]. Several artificial intelligence techniques such as neural networks, evolutionary computation, fuzzy logic, swarm intelligence, probabilistic reasoning, and multi-agent systems have been proposed to secure watermarks in digital media [7]. These techniques are employed either at the transmitter while embedding the watermark or at the receiver during the extraction stage. Recent approaches have also attempted to incorporate AI methods in the pre-processing stage [8].
With such diverse approaches and their capabilities, data hiding in digital images is expected to be superior to existing methods in terms of imperceptibility and robustness [9]. A typical digital watermarking mechanism with embedding and extracting stages is depicted in Figure 1. It enables owners of digital documents to embed their copyright information for information security.
Digital watermarking can be applied to text, image, audio, video, and graphics in the spatial or frequency domain. The watermark can be of noise type (pseudo-noise, Gaussian random, and chaotic sequences) or image type (binary image, stamp, logo, and label) [10]. Different watermarking techniques are used depending on the deployment conditions. For public use, visible watermarks are preferred, while invisible watermarking is used for private applications and to prevent unauthorized copying [11]. Fragile watermarks are used in tamper-proofing applications, whereas robust watermarking is used where the watermark should remain intact even after the content is modified or tampered with [12]. With respect to the detection stage, visual watermarking is more robust but needs the original media and the embedded watermark for detection, while blind watermarking requires neither. Blind watermarking is the most demanding type, as the watermark is generated and embedded at the transmitter while detection and extraction happen at the receiver, as illustrated in Figure 2. Watermarking can also be done in the preprocessing stage. The approach presented here does not impact the efficiency of the network in which a watermark is inserted, as the watermark is embedded while the host network is being trained [13,14].

Literature Review
Many standard quantitative measures and metrics have been proposed to evaluate digital watermarking schemes against their counterparts. Here we report the standard methods found in the literature, as summarized in Table 1, and those used in this work to measure imperceptibility and robustness (along with capacity and computational cost) [34]. For performance evaluation we ensure that i) the model and sources of distortion remain uniform and ii) all test images are 8-bit grey-scale images defined in the same color space.

Table 1: Watermarking methods reported in the literature

Ref.   Transform                      Evaluation criterion
[15]   Discrete Wavelet Transform     Robustness
[16]   Multiwavelet Transform         Imperceptibility
[17]   Discrete Wavelet Transform     Robustness
[18]   Fourier Transform              Fidelity
[19]   Discrete Wavelet Transform     Imperceptibility
[20]   Discrete Wavelet Transform     Imperceptibility
Frequency Domain + Radial Basis NN
[21]   Discrete Cosine Transform      Robustness
[22]   Discrete Wavelet Transform     Imperceptibility
[23]   Discrete Wavelet Transform     Invisibility and robustness
[24]   Discrete Wavelet Transform     Robustness
Frequency Domain + Hopfield NN
[25]   -                              Capacity
[26]   -                              Imperceptibility
[27]   -                              Image quality
[28]   Discrete Cosine Transform      Invisibility and robustness
Frequency Domain + Full Counter Propagation NN
[29]   -                              Robustness, imperceptibility
[30]   Discrete Cosine Transform      Robustness
[31]   Discrete Cosine Transform      Complexity, capacity, PSNR
[32]   Discrete Cosine Transform      Imperceptibility and robustness
Frequency Domain + Synergetic NN
[33]   Discrete Wavelet Transform     Robustness and imperceptibility
Quality assessment can also be done by comparing the original watermark with the extracted watermark. This alternate metric, called Normalized Correlation, exploits the correlation between the original watermark and the extracted one. Its value lies in the range [0, 1], and values nearer to 1 indicate better quality. Another way of defining the similarity between the original watermark and the extracted one is the accuracy ratio, which is the ratio of correct bits to total bits.
The architecture of a CNN differs from that of a conventional neural network, in which all layers are fully connected: here the layers are organized in three dimensions (height, width, and depth). Further, neurons in one layer are connected to only a small region of the next layer. Finally, the output is a single vector of probability scores, organized along the depth dimension [35]. The CNN consists of a series of convolutional and pooling/subsampling layers followed by a fully connected layer. To implement recognition in machines, an algorithm must be shown millions of images before it can generalize from the input and start making predictions for images it has never seen before [36]. The aim is to evolve a more robust and imperceptible watermarking scheme that can cater to the needs of content protection [37].
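The layer sequence described above (convolution, pooling/sub-sampling, fully connected layer, probability vector) can be illustrated with a minimal NumPy sketch. It is not the network used in this work: the function names, the ReLU activation, the 2x2 max pooling, and the softmax output are assumptions chosen only to show how each output value of a convolution depends on a small region of its input.

```python
import numpy as np

def conv2d_valid(image, kernels):
    """Slide each kernel over the image; every output value sees only a small patch."""
    n_k, kh, kw = kernels.shape
    h, w = image.shape
    out = np.zeros((n_k, h - kh + 1, w - kw + 1))
    for k in range(n_k):
        for i in range(h - kh + 1):
            for j in range(w - kw + 1):
                out[k, i, j] = np.sum(image[i:i + kh, j:j + kw] * kernels[k])
    return out

def max_pool2x2(fmap):
    """Keep the maximum of each non-overlapping 2x2 block (sub-sampling)."""
    d, h, w = fmap.shape
    h2, w2 = h // 2, w // 2
    trimmed = fmap[:, :h2 * 2, :w2 * 2]
    return trimmed.reshape(d, h2, 2, w2, 2).max(axis=(2, 4))

def softmax(z):
    """Turn raw scores into a probability vector."""
    e = np.exp(z - z.max())
    return e / e.sum()

def tiny_cnn_forward(image, kernels, dense_w):
    """Hypothetical forward pass: conv + ReLU, pooling, fully connected, softmax."""
    fmap = np.maximum(conv2d_valid(image, kernels), 0.0)  # depth = number of kernels
    pooled = max_pool2x2(fmap)
    return softmax(dense_w @ pooled.ravel())
```

For an 8x8 input with four 3x3 kernels, the feature maps have shape (4, 6, 6), pooling reduces them to (4, 3, 3), and a dense weight matrix of shape (10, 36) yields a vector of ten probability scores.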

Algorithm Design
Watermarking can be achieved in either of two ways: by changing the (least significant) pixel values of the image or by changing the coefficient values [38]. The quality of the watermark depends on the method used. The first method, in which the bits representing the pixel values are manipulated, operates in the spatial domain; it is very simple but not robust, and the watermark can be easily perceived. In simple terms, spatial domain techniques replace pixels of the original image with pixels of the watermark image [39]: the target pixels in a given image are first identified and then replaced. Spatial domain techniques are simple and fast, with low computational complexity [40]. They are immune to cropping and noising but are sensitive to signal processing attacks. If more pixels are manipulated in pursuit of robustness, the watermark may become visible. These algorithms must therefore carefully trade off robustness against imperceptibility [41].
In this section the experimental setup for watermarking a digital image is described with Algorithm 1 and Algorithm 2, covering the stages of digital watermarking from encoding the original image, through the decoding process, to extracting the watermark and obtaining the high-resolution decoded output image. The aim is to evolve a more robust and imperceptible watermarking scheme that can cater to the needs of content protection and piracy prevention [42,47]. All the transform methods (and the few hybrid approaches with NN) are good for watermarking but lack learning and adaptability [43,48]. We propose a digital watermarking method that exploits the expressiveness of deep NN to securely embed an invisible, imperceptible, attack-resilient binary signature into the cover images [49]. For the decoder, we adopt adversarial-model techniques that introduce perturbations so that the desired signature is decoded [44]. We perform extended gradient descent under the Expectation over Transformation framework. In training the decoder network, an Expectation-Maximization (EM) framework is employed to learn feature transformations that are more resilient to attacks [45]. Experimental results indicate that our model achieves robustness across different transformations, including scaling, rotation, adding noise, blurring, random cropping, and more [46].
To implement recognition in machines, an algorithm must be shown millions of images before it can generalize from the input and start making predictions for images it has never seen before [50]. The architecture of a CNN differs from that of a conventional NN, in which all layers are fully connected: here the layers are organized in three dimensions (height, width, and depth). Further, neurons in one layer are connected to only a small region of the next layer. Finally, the output is a single vector of probability scores, organized along the depth dimension. The CNN consists of a series of convolutional and pooling/sub-sampling layers followed by a fully connected layer.
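The spatial-domain approach described at the start of this section, in which the least significant bits of selected pixels are replaced by watermark bits, can be sketched as follows. This is an illustrative baseline only, not the proposed deep learning method; the function names and the choice of the first pixels in raster order as target pixels are assumptions.

```python
import numpy as np

def embed_lsb(cover, watermark_bits):
    """Replace the least significant bit of the first pixels with watermark bits."""
    flat = cover.astype(np.uint8).ravel().copy()
    bits = np.asarray(watermark_bits, dtype=np.uint8).ravel()
    if bits.size > flat.size:
        raise ValueError("watermark has more bits than the cover has pixels")
    # Clear the lowest bit of each target pixel, then write the watermark bit.
    flat[:bits.size] = (flat[:bits.size] & 0xFE) | bits
    return flat.reshape(cover.shape)

def extract_lsb(watermarked, n_bits):
    """Read the watermark back from the least significant bits."""
    return watermarked.ravel()[:n_bits] & 1
```

Each target pixel changes by at most one grey level, which illustrates why such schemes are simple and fast; manipulating more bits per pixel would make the change larger and eventually visible.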
Figure 3 shows the block diagram of a typical digital watermarking process, with Figure 3(a) showing the watermark embedding process. Algorithm 1 illustrates the encoding process of digital watermarking using a deep neural network, and Algorithm 2 presents the corresponding decoding process.
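The idea of gradient descent under the Expectation over Transformation framework, as used in this section for the decoder, is to minimize a loss averaged over randomly sampled transformations of the input. The sketch below is a toy illustration under strong assumptions: a linear decoder with a sigmoid output stands in for the CNN, and the attack model (brightness scaling plus Gaussian noise), the function names, and all hyperparameters are hypothetical rather than those of Algorithm 1 or Algorithm 2.

```python
import numpy as np

def random_transform(x, rng):
    """Hypothetical attack model: brightness scaling plus additive Gaussian noise."""
    scale = rng.uniform(0.8, 1.2)
    return scale * x + rng.normal(0.0, 0.05, size=x.shape)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def eot_loss_and_grad(W, x, bits, rng, n_samples=32):
    """Monte Carlo estimate of the expected cross-entropy loss and its gradient."""
    loss, grad = 0.0, np.zeros_like(W)
    for _ in range(n_samples):
        xt = random_transform(x, rng)
        p = np.clip(sigmoid(W @ xt), 1e-7, 1.0 - 1e-7)
        loss += -np.mean(bits * np.log(p) + (1.0 - bits) * np.log(1.0 - p))
        grad += np.outer(p - bits, xt) / bits.size  # gradient of the mean cross-entropy
    return loss / n_samples, grad / n_samples

def train_decoder(x, bits, steps=200, lr=0.5, seed=0):
    """Plain gradient descent on the transformation-averaged loss."""
    rng = np.random.default_rng(seed)
    W = np.zeros((bits.size, x.size))
    losses = []
    for _ in range(steps):
        loss, grad = eot_loss_and_grad(W, x, bits, rng)
        W -= lr * grad
        losses.append(loss)
    return W, losses
```

Because every update averages over many sampled transformations, the decoder is pushed towards weights that recover the signature from transformed inputs rather than from the clean input alone.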

Performance Evaluation
Fair performance evaluation of any system is fundamental for its acceptance and accreditation. Many standard quantitative measures and metrics have been proposed to evaluate digital watermarking schemes against their counterparts. Here we report the standard methods found in the literature and those used in this work to measure imperceptibility and robustness (along with capacity and computational cost). For performance evaluation we ensure that i) the model and sources of distortion remain uniform and ii) all test images are 8-bit gray-scale images defined in the same color space. For an image of M x N pixels, with pixel values O for the original image (without watermark) and W for the watermarked image, the performance metrics can be calculated as below:
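As a minimal sketch of imperceptibility metrics for an M x N image pair (O, W), the code below assumes the standard definitions of mean squared error (MSE) and peak signal-to-noise ratio (PSNR), with a peak value of 255 for 8-bit images. The function names are illustrative, and the definitions are the usual textbook ones rather than formulas taken from this work.

```python
import numpy as np

def mse(original, watermarked):
    """Mean squared error over all M x N pixel pairs."""
    o = np.asarray(original, dtype=np.float64)
    w = np.asarray(watermarked, dtype=np.float64)
    return np.mean((o - w) ** 2)

def psnr(original, watermarked, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = mse(original, watermarked)
    if err == 0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / err)
```

A higher PSNR means the watermarked image W is closer to the original image O, i.e. the watermark is less perceptible.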

Robustness Metrics
The robustness of a watermark is a measure of its resistance to attacks. The type of detector dictates the evaluation method used to measure robustness. The three types of detector response are i) hard decisions (true or false for the presence or absence of the watermark, respectively), ii) soft decisions (correlation or similarity coefficients expressed as real numbers), and iii) bit sequences (if the embedded watermark is in the form of a message). The robustness of hard and soft decision detectors can be measured using Receiver Operating Characteristic (ROC) graphs, while Bit Error Rate (BER) is used when the detector response is a bit sequence.
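An ROC graph for a soft-decision detector can be traced by sweeping a decision threshold over the detector scores and recording the false positive and true positive fractions at each threshold. The sketch below assumes the usual definitions TPF = TP / (TP + FN) and FPF = FP / (FP + TN); the function name and the convention that a score at or above the threshold counts as a detection are illustrative assumptions.

```python
import numpy as np

def roc_points(scores_marked, scores_unmarked, thresholds):
    """Return (FPF, TPF) pairs for a soft-decision detector.

    scores_marked:   detector responses on images that carry the watermark
    scores_unmarked: detector responses on images without the watermark
    """
    marked = np.asarray(scores_marked, dtype=float)
    unmarked = np.asarray(scores_unmarked, dtype=float)
    points = []
    for t in thresholds:
        tp = np.sum(marked >= t)    # watermark present and detected
        fn = marked.size - tp       # watermark present but missed
        fp = np.sum(unmarked >= t)  # watermark absent but reported
        tn = unmarked.size - fp     # watermark absent and not reported
        points.append((fp / (fp + tn), tp / (tp + fn)))
    return points
```

A hard-decision detector corresponds to a single point on this curve, since its threshold is fixed.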

ROC:
True Positive Fraction (TPF):
$$\mathrm{TPF} = \frac{TP}{TP + FN}$$
False Positive Fraction (FPF):
$$\mathrm{FPF} = \frac{FP}{FP + TN}$$
where TP, FN, FP and TN denote the numbers of true positive, false negative, false positive and true negative detections.
Quality assessment can also be done by comparing the original watermark with the extracted watermark. This alternate metric, called Normalized Correlation, exploits the correlation between the original watermark and the extracted one. Its value lies in the range [0, 1], and values nearer to 1 indicate better quality. Another way of defining the similarity between the original watermark and the extracted one is the accuracy ratio, which is the ratio of correct bits to total bits. When the attack does not affect the commercial value, it is better to consider only the watermark-to-noise ratio, which expresses the power of the watermark signal relative to the noise introduced by such attacks. For a watermark of size m x n pixels, the performance metrics are calculated using the indicator function δ(X, Y), where δ(X, Y) = 1 if X = Y and 0 otherwise.
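The watermark comparison metrics named above can be sketched in a few lines. The definitions used here are common ones and are assumptions rather than formulas taken from this work: BER as the fraction of differing bits, the accuracy ratio as the mean of δ over all m x n positions, Normalized Correlation as the inner product divided by the product of the norms, and the watermark-to-noise ratio as a power ratio in dB. All function names are illustrative.

```python
import numpy as np

def bit_error_rate(original_bits, extracted_bits):
    """Fraction of watermark bits that differ after extraction."""
    o = np.asarray(original_bits).ravel()
    e = np.asarray(extracted_bits).ravel()
    return np.mean(o != e)

def accuracy_ratio(original_bits, extracted_bits):
    """Ratio of correct bits to total bits, i.e. the mean of delta(o, e)."""
    o = np.asarray(original_bits).ravel()
    e = np.asarray(extracted_bits).ravel()
    return np.mean(o == e)

def normalized_correlation(original, extracted):
    """Inner product of the two watermarks divided by the product of their norms."""
    o = np.asarray(original, dtype=np.float64).ravel()
    e = np.asarray(extracted, dtype=np.float64).ravel()
    denom = np.sqrt(np.sum(o ** 2)) * np.sqrt(np.sum(e ** 2))
    return float(np.sum(o * e) / denom) if denom > 0 else 0.0

def watermark_to_noise_ratio_db(watermark_signal, noise):
    """Power of the watermark signal relative to the attack noise, in dB."""
    w = np.asarray(watermark_signal, dtype=np.float64)
    n = np.asarray(noise, dtype=np.float64)
    return 10.0 * np.log10(np.mean(w ** 2) / np.mean(n ** 2))
```

For binary watermarks the accuracy ratio equals one minus the BER, so the two quantities carry the same information.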

Results Obtained
In this section we discuss watermarking in the presence of various attacks at different noise levels. We present the results in terms of the robustness of the watermark (BER and NC). The CNN is trained using the standard gradient descent backpropagation algorithm. The performance consistency of the proposed method is verified with two different cover images, Lena and Cameraman. Figure 4 shows the watermarking process with different transforms (scaling, rotation, adding noise, blurring, random cropping, and more) and confirms that the proposed method is applicable to any cover image and any watermark.
BER is a measure of the noise injected when the signal is received after the transmission channel; it shows the signal loss and fading in a wireless channel. The BER for various noise levels is shown in Figure 5(a). Normalized (cross) correlation is a template-matching method in digital watermarking. The template is an image that shows a critical feature; a statistic is computed repeatedly between the watermarked image and the corresponding pixels of a subset of the original image, which gives the noise correlation for noise levels of 2, 5, 10 and 15.
The important parameters in watermarking are the loss of original information and the accuracy with which the watermark is hidden in the cover image. There is a fine balance between robustness and imperceptibility: the goal is to achieve high robustness without showing any trace of the watermark. The accuracy of our model increases gradually as the number of epochs increases, while the loss gradually decreases. The model's training and testing results are shown in Figure 6, with the over-fit, under-fit and good-fit cases shown in Figures 6(a), 6(b) and 6(c) respectively. Our model trained with Adam performs well and is consistent compared with the other models, as shown in Figure 7. Figure 8 shows the noise injected/present and the model performance.
In Figure 8(a) the noise in the hidden layer is shown, while Figure 8(b) depicts the noise in the input layer. The graph shows that the training and test results are in good agreement, which validates the model used for training and optimization in digital watermarking. Because the training is done off-line, the proposed model can be used in real-time applications and is therefore suitable for video too. The use of deep neural networks thus enables complex features such as learning, weight sharing and update mechanisms, noise resilience and immunity to attacks, and scaling. The proposed method has low loss, and its accuracy increases with the number of epochs: the accuracy is 85% at 15 epochs and increases slightly with more epochs. This high accuracy shows that the original image and the watermarked image are indistinguishable.
This high performance of the model can be attributed to the learning capability embedded in the watermarking. Since a digital watermarking task comprises embedding a signal into an image subject to robustness and quality constraints, it is in essence a multi-objective optimization problem. Faster convergence of the algorithm (higher accuracy in fewer epochs) could be achieved by introducing reinforcement learning or transfer learning, both of which have received considerable attention in recent times. The current digital watermarking method can also be validated by applying it to protect the copyright of trained neural networks, where ownership protection and piracy prevention are of utmost priority.
Traditional approaches to digital watermarking are not fit for digital data that can be stored efficiently and with very high quality and manipulated easily using computers [59,60]. The aim is to evolve secure digital communication that pushes the limits of legacy digital watermarking schemes across all dimensions of the performance metrics. As this research space is ever increasing and innovation has been spurred at all levels of communication, considerable progress is still required in understanding deep learning approaches for digital watermarking. The convergence of technological, economic, and environmental forces is driving digital watermarking and deep learning simultaneously, and each drives the other forward. Whether deep neural networks drive digital watermarking or vice versa, the continued expansion of each is good for the other.
The complete suitability of digital watermarking for securing deep neural network datasets is yet to be studied. Trained models can be viewed as intellectual property, and providing copyright protection for them is a worthy challenge. We emphasize how the copyright of trained models can be protected computationally and propose a digital watermarking technology for neural networks.
We propose a conceptual framework for integrating a watermark into deep neural network models to safeguard copyright and to identify intellectual property violations of trained models. Table 2 compares the obtained results with previous works; notably, the proposed work shows an improvement of 5.75% in PSNR without noise and 6.68% in PSNR with noise = 5.

Conclusions
In this paper, we proposed a learning framework for a robust digital image watermarking technique based on a deep neural network. Previous efforts in this space focused on optimizing the embedding parameters using evolutionary computing. We demonstrated watermarking under various noise levels and attacks, carried out detailed experiments, and analyzed the performance of the designed system. We have shown that our model can embed a watermark without impairing a deep neural network's efficiency. Future research will continue in the direction of intelligence-based digital watermarking.