Comparison of Support Vector Machine-Based Equalizer and Code-Aided Expectation Maximization on Fiber Optic Nonlinearity Compensation Using a Proposed BER Normalized by Power and Distance Index

ARTICLE INFO
Article history: Received: 10 September, 2020 Accepted: 07 October, 2020 Online: 24 November, 2020

ABSTRACT
Advances in optimizing optical fiber communications have been on the rise in recent years due to the increasing demand for larger data bandwidths and better overall efficiency. Coherent optics has been the focus of much research because of its ability to transport greater amounts of information, offer better flexibility in network implementations, and support different baud rates and modulation techniques. As a result, fiber-optic lines provide faster speeds to end-users. Recent literature has looked into further developing digital signal processing techniques, while other work has focused on fiber material optimization. Machine learning is another area of research that has gained traction due to such demands. This survey discusses how support vector machine (SVM) and code-aided expectation-maximization (CAEM) techniques compensate for nonlinearity in coherent fiber optical communications. The study mainly focuses on how these techniques impact the performance of the transmissions in which they are implemented and how they compensate for fiber optic nonlinearity, measured through the reduction of bit error rates (BERs), improvements in the quality factor, or a suggested index based on BER, power, and distance. Collating the results and based on the proposed index, SVM is preferable for mid-range haul transmissions, while CAEM is preferable for longer hauls.


Introduction
With the growing need to manage, transmit, process, and receive large amounts of data, optical fiber transmission can find a niche in today's network communication age, as it can address several network traffic issues. Compared to other communication methods such as radio wave propagation and other physical transmission media like twisted pair and coaxial cables, optical fiber transmission can deliver more data, operate more efficiently, occupy less space while having more capacity, and be less susceptible to interception. Despite these advantages, optical communication has its share of disadvantages, such as cost, complexity, and, perhaps the most commonly faced issue, phase sensitivity [1]. Over recent years, coherent optical fiber communications coupled with newly developed digital signal processing techniques have improved and optimized data transmissions. One such stride was the shift from single carrier multiplexing to coherent optical orthogonal frequency division multiplexing (CO-OFDM), which brought advantages such as inter-symbol interference mitigation and higher bandwidth efficiency. However, a serious disadvantage of OFDM is its higher peak-to-average power ratio (PAPR), which results in a phenomenon known as fiber optic nonlinear distortion [2]. Digital modulation techniques that are usually paired with OFDM, such as amplitude shift keying (ASK), phase shift keying (PSK), and quadrature amplitude modulation (QAM), are greatly affected by nonlinear distortions, as these can significantly increase the bit error rate (BER) of the system.
Numerous studies and innovations, such as those in [3]-[5], have sought to minimize the nonlinearity experienced in this multiplexing process. Approaches include varying or combining the digital signal processing (DSP) techniques applied, optimizing the material that coats the fiber optic core to reduce birefringence, and implementing neural networks in the transmitter, the receiver, or both to maximize the bandwidth delivered.

ASTESJ ISSN: 2415-6698
Just like any other transmission system, coherent optical fiber transmissions suffer performance drops due to several types of induced noise. Amplified spontaneous emission (ASE) from inline amplifiers is one of the major contributors to linear noise. Laser phase noise from the transmitter and the local oscillator also degrades system performance by a relatively significant margin. However, the largest contributor, and perhaps the most significant cause for concern, is the nonlinear phase noise (NLPN) caused by the interaction of the signal and the ASE mentioned above through the phenomenon known as the fiber Kerr effect or Kerr-induced nonlinearity [6]. For modulation formats like quadrature amplitude modulation (QAM), which relies heavily on signal amplitude and phase shifting, noise can significantly impact performance, specifically the BER. Even a small amount of noise can cause the transmitted data to be categorized incorrectly. This error is especially apparent in optical transmissions because most systems that use the orthogonal frequency division multiplexing scheme produce a high peak-to-average power ratio, resulting in noise that can increase the BER in a transmission medium. Hence, solutions that address such nonlinearity and yield reduced noise [8] are essential. In this work, two important techniques that work to that end, SVM and CAEM, are analyzed in terms of how they affect optical fiber transmissions regarding overall performance and fiber optic nonlinearity compensation. A performance comparison was made between the two using the parameters of the different studies. In particular, the emphasis was on the use of coherent optical orthogonal frequency division multiplexing (CO-OFDM) or polarization division multiplexing using a 16-ary QAM (16-QAM) signal with varying fiber lengths and variables. The authors also propose a comparative index to fairly evaluate the two techniques based on the bit error rate (BER), power, and transmission distance.
This paper is organized as follows. Section 2 discusses the methods and the proposed comparative index. Section 3 presents the results and their discussion, and Section 4 gives recommendations.

16-QAM Least-Squares SVM Nonlinearity Equalizer
A 16-QAM CO-OFDM system coupled with an SVM nonlinear equalizer is proposed in [2]. The optical fiber link is composed of multiple 100 km spans of standard single-mode fiber (SSMF). Attenuation in the link is compensated using erbium-doped fiber amplifiers (EDFAs). The ASE contributed by the EDFAs is treated as white Gaussian noise. Inputs to the digital modulator are generated by a pseudo-random binary sequence (PRBS) module and then undergo QAM modulation. An inverse fast Fourier transform (IFFT) module converts the frequency-domain subcarrier symbols into an equivalent time-domain signal. In this case, the pulses of the signal are kept ideal to simplify the simulation. To maximize linear conversion between the OFDM signal and the optical field, the in-phase and quadrature components of the OFDM signal drive a pair of Mach-Zehnder modulators (MZMs). Both MZMs operate in a push-pull configuration and are biased at the minimum transmission point to remove the chirp phenomenon effectively. Once the optical signal reaches the receiver after traversing $N_{span}$ amplifiers, it is converted into an electrical signal by a 90° photoreceiver. Noise due to imperfections in the laser linewidth is disregarded, as the study aims to isolate fiber nonlinearities from other noise sources [2]. The signal then undergoes the conventional receiver processing before being fed into the machine learning algorithm, after which it is fully demodulated and the error rate is calculated.
Support vector machines are powerful classification tools, yet they are inherently binary classifiers. Due to this restriction, the approach in this study combines multiple two-class SVMs into a multiclass model. Since the study uses 16-QAM as its modulation technique, the signal is divided into sixteen (16) individual clusters in a constellation, wherein each cluster represents data in a unique binary form [9].
For a single two-class SVM classifier, N pairs of vectors $(x_k, y_k)$, $k = 1, \ldots, N$, where $x_k$ and $y_k$ are the input and output patterns, respectively, undergo training to obtain the hyperplane. $y_k$ is the label, with $y_k \in \{1, -1\}$. Through this training process, possible noise in the constellation is handled efficiently and accurately. Based on the training data present, SVM aims to construct a classifier
$$f(x) = \operatorname{sign}\left(\sum_{k=1}^{N} \alpha_k y_k \, \Phi(x_k)^{T} \Phi(x) + b\right)$$
where $\alpha_k$ is the weight, $x_k$ is the support vector, $b$ is the bias term, and $\Phi(\cdot)$ is the mapping function. The training process determines the weight, support vector, and bias terms used for the constructed classifier. For more complex data to be accurately separated, the mapping function transforms the training data $x_k$ into a higher dimension. The SVM also uses the kernel trick, which allows nonlinear decision boundaries to be computed at low complexity. The approach covered in this survey uses a radial basis function (RBF) kernel, for which only dot products are needed:
$$K(x_i, x_j) \equiv \exp\left(-\gamma_{SVM} \lVert x_i - x_j \rVert^{2}\right), \quad \gamma_{SVM} > 0,$$
with $\gamma_{SVM}$ as the kernel parameter.
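As an illustration of the radial basis function kernel, $K(x_i, x_j) = \exp(-\gamma_{SVM} \lVert x_i - x_j \rVert^2)$, the following minimal sketch evaluates it for two-dimensional (I, Q) points. The function name, the sample points, and the value 0.5 for the kernel parameter are hypothetical choices for illustration and are not taken from [2].

```python
import math


def rbf_kernel(x_i, x_j, gamma_svm):
    """Radial basis function kernel: exp(-gamma * ||x_i - x_j||^2), gamma > 0."""
    if gamma_svm <= 0:
        raise ValueError("gamma_svm must be positive")
    # Squared Euclidean distance between the two points.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return math.exp(-gamma_svm * sq_dist)


# Hypothetical (I, Q) samples: an ideal constellation point and two received points.
ideal = (1.0, 1.0)
received_near = (1.1, 0.9)
received_far = (3.0, -1.0)

k_near = rbf_kernel(ideal, received_near, 0.5)  # close points give a value near 1
k_far = rbf_kernel(ideal, received_far, 0.5)    # distant points give a value near 0
```

The kernel value decays with distance, so a received point contributes most to the decision for clusters whose support vectors lie nearby.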
Since there are more than two classes to separate in a 16-QAM constellation, the study employed a one-versus-rest rule, wherein a received data point is assigned to a particular cluster if and only if it is accepted by that cluster and rejected by the rest. If two or more clusters accept the data point, it is considered noise. To further increase the SVM classification accuracy in noisier environments, the study opted to implement a least-squares variant of the SVM studied in [11]. Least-squares SVM (LS-SVM) provides a more optimized solution using the following:
$$\min_{w, b, \xi} \; \frac{1}{2} w^{T} w + C \sum_{k=1}^{N} \xi_k$$
constrained by
$$y_k \left( w^{T} \Phi(x_k) + b \right) \geq 1 - \xi_k, \quad k = 1, \ldots, N,$$
where $w$ is the weight vector of the hyperplane, $b$ is the bias term, and $\xi_k$ is termed the slack variable, an error term that must satisfy the condition $\xi_k \geq 0$. $C$ is known as the regularization parameter.
After the chromatic dispersion (CD) compensation process, both the I and Q components of the signal are fed into the LS-SVM, in which the classifier is formed through a two-stage process of training and testing [2], [11].
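The one-versus-rest rule described above can be sketched as follows. This is only an illustration of the decision logic: the function name is hypothetical, the accept/reject flags are assumed to come from sixteen already-trained binary classifiers, and treating a point that no cluster accepts as unclassified is this sketch's reading of the "if and only if" condition rather than something stated in [2].

```python
def one_vs_rest_decision(accepts):
    """Return the index of the single accepting cluster, or None.

    accepts[c] is True when the binary classifier for cluster c accepts
    the received point. A point accepted by two or more clusters is
    treated as noise, and a point accepted by none stays unclassified;
    both cases return None.
    """
    accepted = [c for c, ok in enumerate(accepts) if ok]
    return accepted[0] if len(accepted) == 1 else None


# Hypothetical outputs of sixteen binary classifiers for three received points.
only_cluster_5 = [c == 5 for c in range(16)]
clusters_2_and_7 = [c in (2, 7) for c in range(16)]
no_cluster = [False] * 16

decisions = [
    one_vs_rest_decision(only_cluster_5),
    one_vs_rest_decision(clusters_2_and_7),
    one_vs_rest_decision(no_cluster),
]
```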

Training
• Arrange the label, in-phase I, and quadrature Q values to format the SVM packet.
• Use cross-validation to determine the optimal $C$ and $\gamma_{SVM}$ values.
• Use the optimal $C$ and $\gamma_{SVM}$ values to train the SVM.

Testing
• Insert testing symbol.
• Compare predicted labels to transmitted symbols to determine and evaluate the BER.
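The final testing step, comparing predicted labels with transmitted symbols to evaluate the BER, can be sketched as below. The sketch assumes that each 16-QAM cluster label is an integer from 0 to 15 whose 4-bit binary form is the transmitted bit pattern; the actual bit-to-symbol mapping used in [2] is not given here, so this labeling and the function name are hypothetical.

```python
def bit_error_rate(tx_labels, rx_labels, bits_per_symbol=4):
    """Fraction of bits that differ between transmitted and predicted labels."""
    if len(tx_labels) != len(rx_labels):
        raise ValueError("label sequences must have the same length")
    if not tx_labels:
        raise ValueError("at least one symbol is required")
    # XOR marks the differing bits of each label pair; count the ones.
    bit_errors = sum(bin(t ^ r).count("1") for t, r in zip(tx_labels, rx_labels))
    return bit_errors / (len(tx_labels) * bits_per_symbol)


# Hypothetical transmitted and predicted cluster labels (0..15).
transmitted = [0, 5, 10, 15]
predicted = [0, 4, 10, 12]

ber = bit_error_rate(transmitted, predicted)  # 3 wrong bits out of 16
```

Counting bit errors rather than symbol errors matters because one misclassified symbol can corrupt between one and four bits, depending on which cluster it lands in.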

Code-Aided Expectation Maximization
A wavelength-division multiplexing (WDM) and polarization division multiplexing (PolDM) system is considered in [12]. The system has nine simultaneously transmitting channels using 16-QAM modulation at a 32 GBaud symbol rate, with the middle channel being the main focus of the study. The optical signal travels a distance of 2640 km, consisting of 33 spans of 80 km each. Dispersion-compensating fiber (DCF) is not used in the study; however, the same standard single-mode fiber (SSMF) is used for the fiber cable, and an erbium-doped fiber amplifier (EDFA) is employed to counteract fiber loss. At the receiver, the signal undergoes chromatic dispersion compensation followed by polarization demultiplexing. A frequency estimation (FE) stage acts on the sampled signal to estimate the frequency with an accuracy equal to or better than 4 MHz. A Viterbi algorithm, which uses an FIR filter with phase averaging, compensates for any residual frequency offset. Because the noise is correlated over time, traditional demodulation methods that assume white noise are often suboptimal. The study in [12] partially compensates for impairments caused by both inter-channel and intra-channel nonlinear effects by exploiting the correlation of the phase noise through the use of CAEM. The phase noise correlation is dealt with by using a regularizer in the utility function of the algorithm. The CAEM algorithm is described as follows. The initial log-likelihood ratio (LLR) of each bit of the $k$-th symbol is computed for the bit indices
$$i = (k - 1)\log_2 M + 1, \ldots, k \log_2 M \quad (8)$$
where $M$ is the constellation size. It is assumed that the residual noise in the received symbols follows a circularly symmetric zero-mean white Gaussian distribution. Constellation points whose label bit equals 0 are collected in one set, and those whose label bit equals 1 in another.
3. An updated LLR is obtained by running a soft-input soft-output (SISO) FEC decoder on the initial LLRs. The updated LLRs are then converted to bit probabilities. Assuming that each constellation point is labeled by a sequence of $\log_2 M$ bits, with $M$ the constellation size, the symbol probabilities follow from the probabilities of the label bits.
4. The utility function (10) is evaluated, where N is the total number of symbols and the BER optimization weight takes a value in [0, 1]. The regularizer function in (10) depends on the noise variance, which can be empirically computed with the aid of the training data.
5. Maximization step: the updated phase noise estimate vector is calculated. The estimate carries two iteration indices, one counting the iterations between the FEC decoder and the EM algorithm and the other counting the iterations between the E and M steps. To compute the updated estimate numerically, a gradient-ascent method is used.
6. The utility function is recomputed; however, instead of the quantity from step 3, the following is used.
7.
8. Steps 1 to 7 are repeated until the phase noise estimate converges, and the converged value is taken as the final phase noise estimate.
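The conversion of decoder LLRs into probabilities in step 3 can be illustrated with a short sketch. It rests on assumptions that are not confirmed by [12]: the sign convention LLR = ln(P(bit = 0) / P(bit = 1)), independence of the label bits within a symbol, and a 4-bit label per 16-QAM point. All function names and LLR values are hypothetical.

```python
import math
from itertools import product


def llr_to_prob_zero(llr):
    """P(bit = 0), assuming LLR = ln(P(bit = 0) / P(bit = 1))."""
    # Two branches keep exp() from overflowing for large |llr|.
    if llr >= 0:
        return 1.0 / (1.0 + math.exp(-llr))
    e = math.exp(llr)
    return e / (1.0 + e)


def symbol_probability(label_bits, llrs):
    """Probability of one constellation label, assuming independent bits."""
    prob = 1.0
    for bit, llr in zip(label_bits, llrs):
        p_zero = llr_to_prob_zero(llr)
        prob *= p_zero if bit == 0 else 1.0 - p_zero
    return prob


# Hypothetical updated LLRs for the four label bits of one 16-QAM symbol.
llrs = [2.0, -1.0, 0.5, 0.25]

# Probability of each of the 16 possible 4-bit labels.
priors = {bits: symbol_probability(bits, llrs) for bits in product((0, 1), repeat=4)}
most_likely = max(priors, key=priors.get)
```

Under these assumptions the sixteen label probabilities sum to one, so they can serve as symbol weights when the utility function is evaluated.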

Simulation Parameters
The following tables show the parameters and conditions of each approach.

Comparative Index, Results, and Discussions
The tables below show the results of each study's approach. Since the comparison is to be done using the BER, findings reported as Q-factor results were converted to their equivalent BER using the following [13]:
$$\mathrm{BER} = \frac{1}{2}\operatorname{erfc}\left(\frac{Q}{\sqrt{2}}\right) \quad (15)$$
$$Q(\mathrm{dB}) = 10\log_{10}(Q^{2}) = 20\log_{10}(Q) \quad (16)$$
where Q is the Q-factor and erfc() is the complementary error function. Since the erfc used in calculating the BER is approximated, any BER values computed are approximations. To preserve accuracy, BER values are reported to seven decimal places.
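The Q-factor to BER conversion can be sketched in a few lines. The sketch assumes the standard Gaussian-noise relation BER = 0.5 erfc(Q / sqrt(2)) together with Q(dB) = 20 log10(Q); the function names are hypothetical, and the sample Q-factor is illustrative rather than a value from the studies compared here.

```python
import math


def q_db_to_linear(q_db):
    """Invert Q(dB) = 20*log10(Q) to recover the linear Q-factor."""
    return 10 ** (q_db / 20)


def q_to_ber(q_linear):
    """BER = 0.5 * erfc(Q / sqrt(2)) for a linear Q-factor."""
    return 0.5 * math.erfc(q_linear / math.sqrt(2))


# Illustrative value only: a linear Q-factor of 6 corresponds to a BER near 1e-9.
q_db = 20 * math.log10(6.0)
ber = q_to_ber(q_db_to_linear(q_db))
```

Because the BER falls off very steeply with Q, small errors in reading a Q-factor from a graph translate into large relative errors in the resulting BER estimate.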

Proposed Index
The authors proposed an index (17), which is a measure of how well each technique compares to others depending on the application. The index in Table 8 is calculated as follows:

LS-SVM Results
The BER values in Table 3 are extrapolated from the graphical results in [2], while BER values in Tables 4 and 5 are calculated estimates from the data of [2] using (15) and (16).

CAEM Results
The Q-factor values in Table 6 are extrapolated from the graphical results in [12]. BER values are calculated estimates using (15) and (16) based on the Q-factor.

Table 7 shows BER values attained from the different studies under highly similar conditions, thus making it possible to compare their results. LS-SVM can be compared at a -6 dBm launch power and a 1200 km fiber length, while CAEM has results at a 0 dBm launch power and a 2640 km fiber length.

Low index values mean better overall performance for the algorithm. As shown in Table 8, LS-SVM is well suited to mid-range haul, low-complexity applications, whereas CAEM is suited to long-haul, high-complexity optical networks. The index provided in (17) for measuring and comparing the performance in the studies [2], [12] is thus proposed for nonlinearity compensation performance comparisons. Nonetheless, it is also recognized that multiple values should be considered and further simulated to obtain graphical representations of the results.

Recommendations
Based on the results, LS-SVM provides nonlinearity compensation in a CO-OFDM 16-QAM system, but for longer fiber lengths, CAEM provides a significantly lower BER value. These outcomes make CAEM the preferred choice for long-haul optical transmissions. Where low complexity is the priority, LS-SVM is recommended; for long-haul links, CAEM is preferred despite its higher complexity because of its significantly lower index. One potential direction for future work is to verify the results in an experimental setup.