HUST bearing: a practical dataset for ball bearing fault diagnosis

Objectives The rapid growth of machine learning methods has increased the demand for data. For bearing fault diagnosis, data acquisition is time-consuming and involves complicated processes. Existing datasets focus on only one type of bearing, which limits real-world applications. Therefore, the objective of this work is to propose a diverse dataset for vibration-based ball bearing fault diagnosis. Data description In this work, we introduce a practical dataset named HUST bearing, which provides a large set of vibration data on different ball bearings. This dataset contains 99 raw vibration signals covering 6 types of defects (inner crack, outer crack, ball crack, and their pairwise combinations) on 5 types of bearing (6204, 6205, 6206, 6207, and 6208) at 3 working conditions (0 W, 200 W, and 400 W). Each vibration signal is sampled at 51,200 samples per second for 10 s. The data acquisition system is carefully designed for high reliability.


Introduction
Electric motors are an indispensable component in manufacturing plants. They often run continuously, carry a variety of heavy loads, and operate in harsh environments. As a result, the bearings in electric motors are vulnerable components. Statistics show that bearing-related faults account for nearly 50% of common electric motor faults [1]. The consequences of a bearing fault are severe, including the destruction of mechanical structures, production stoppage, and costly repair [2]. For these reasons, the early detection and diagnosis of bearing faults is an urgent task and plays a crucial role in electrical machine health monitoring.
When a bearing defect appears, it generates abnormal signals such as vibration, sound waves, and temperature changes [3]. In particular, the vibration signal often contains useful information, so it is widely used to detect and diagnose bearing faults [4]. Vibration-based methods usually fall into three groups: physical model-based, signal processing-based, and machine learning-based methods [5]. Physical model-based approaches require a deep understanding of machine mechanics and operating principles, which makes it challenging to build high-precision physical models. Signal processing-based approaches use bearing fault characteristics for detection and diagnosis. However, they are difficult to apply in the early stages of a fault, when the fault characteristics are not yet clearly expressed. Machine learning approaches focus on analyzing and learning from data. They have shown superiority over conventional methods by giving highly accurate results, especially thanks to deep learning models [6]. Typical machine learning methods include the support vector machine, principal component analysis, and the artificial neural network.
The rapid growth of machine learning methods has increased the demand for data. These data are collected from defective bearings to support the learning process. In practice, acquiring them is time-consuming and involves difficult and complicated processes [7]. Data acquisition tasks include defect generation on bearings, measurement system setup, and data recording. Fortunately, many open-access datasets have been published for scientists and engineers to use in research and development. Some of the most popular bearing fault datasets are CWRU [8], Paderborn [9], IMS [10], and Pronostia [11].
The CWRU dataset uses a test bench with a 2 HP induction motor along with a torque transducer and a dynamometer. Vibration data were captured on drive-end and fan-end bearings of types 6205 and 6203 with 5 man-made single faults, ranging in size from 7 mils to 40 mils. The motor's operating conditions consist of loads from 0 to 3 HP (speeds from 1720 to 1797 rpm), with sampling frequencies of 12 kHz and 48 kHz. The Paderborn dataset includes synchronous vibration and stator current signals. Both signals were measured at a high sample rate on 26 defective and 6 healthy bearings. Of the 26 defective bearings, 14 have realistic faults produced by accelerated life testing, and the remaining 12 have artificial faults. This dataset increases the reliability of diagnostic models because it uses natural faults generated from the degradation of bearing and lubricant. The IMS dataset generates realistic faults by accelerating bearing life. Specifically, ZA-2115 bearings were run continuously for 30 days under a heavy load of 6000 lbs to cause defects. The vibration signal is measured simultaneously with the bearing temperature, which is monitored with a thermocouple to track lubricant deterioration. The experiment is repeated several times until all three types of defects appear: in the inner race, the outer race, and the ball. Data are recorded intermittently, every 5 or 10 minutes, as 1-second snapshots sampled at 20 kHz. Another popular dataset is Pronostia, which also generates faults using accelerated life testing. This dataset provides realistic data on the accelerated deterioration of bearings under a variety of operating conditions. The collected data also include vibration and temperature signals, recorded at a high frequency of 25.6 kHz.
However, all of the datasets mentioned above focus on only one type of bearing, which limits real-world applications, as described below. Figure 1 illustrates the data collection systems of the four datasets, and Table 1 compares them. Approaches using published datasets can establish intelligent models that predict bearing faults, but these models rely on the data they have learned from. Data in published datasets and data collected in practice may have different distributions, which are influenced by complex factors such as bearing dimensions, operating conditions, measurement conditions, and the external environment [12]. Therefore, intelligent diagnostic models based on published datasets often cannot perform well when applied directly to real data [12]. To solve this problem, many domain adaptation methods have been proposed to reduce the distribution differences between datasets [13][14][15][16][17]. These methods are more effective when the distributions do not differ too much, i.e., it is highly beneficial to have suitable reference datasets to choose from.

Defects generation
In general, there are two ways to create bearing defects: artificial and natural. The natural fault generation process is relatively complex, so the bearing defects in our dataset are artificially generated. Our target is electric motors operating at low loads, so the defects are created on bearings of types 6204, 6205, 6206, 6207, and 6208. As for the fault types, our data set includes the common single faults: inner race fault, outer race fault, and ball fault. In addition, we have observed that once a bearing develops one defect, mechanical interaction easily leads to a defect on another component. Therefore, double faults are also included: inner and outer faults, inner and ball faults, and outer and ball faults (see Figure 2). We use wire cutting to create a micro-crack 0.2 mm wide, simulating the initial state of the fault. We created only this one early-stage fault size because we believe that diagnosing faults early is much more important than diagnosing their extent, so a 0.2 mm fault is sufficient for our purpose. The bearing dimensions and the ratios between the fault frequencies and the shaft frequency (fs) for the fault types in our dataset are shown in Table 2.

Test bench setup
The basic layout of the test bench is shown in Figure 3. The test bench consists of a 750 W (1 HP) induction motor driving a multi-step shaft and a Leroy Somer powder brake. A multi-step shaft is a shaft with multiple step changes in diameter. The motor is controlled by an inverter, and the powder brake acts as a simulated load. Furthermore, a torque transducer and a dynamometer are mounted on the shaft to monitor the load and speed of the motor. Defective bearings are mounted in different types of housings, and these housings can be flexibly replaced on the multi-step shaft. A PCB 325C33 accelerometer is installed on the bearing in the vertical direction to measure vibration. Note that for the group of outer race faults, because the vibration amplitude is affected by the position of the outer fault, accelerometers are installed in both the vertical and horizontal directions. The description of all 120 data sets is shown in Table 3. It includes vibration data of 30 prototype defective bearings at three load conditions of 0 W, 200 W, and 400 W. In addition, the dataset provides 30 vibration recordings of the defective bearings during a run-up time of 5 seconds. Vibration during run-up may give an early warning of an adverse bearing condition, especially for high-power electric motors, which take longer to start. It is worth noting that all data are collected at a high sample rate of 51.2 kHz, so the recordings capture signal changes in detail.

Data analysis

Envelope analysis
Envelope analysis is a fundamental and well-known procedure for bearing fault diagnosis in steady state [18]. Because frequency analysis of the raw signal often does not provide useful diagnostic information, envelope analysis is a successful alternative for bearing fault detection [19]. Therefore, in this examination, bearing defect diagnosis is based on envelope analysis of the vibration signals in steady state, i.e., at constant shaft speed. The analysis yields an envelope spectrum, and the acquired vibration signals are checked for the presence of the fault frequencies in this spectrum [18]. Furthermore, the detection and shape of the defect frequencies provide a first estimate of the behavior of the signals and of the corresponding fault state.
The envelope of a vibration signal can be computed in different ways, i.e., there are several types of envelope signal. In this examination, the Hilbert envelope, which is a popular envelope type, is used. Envelope analysis with the Hilbert envelope has three main steps. Given the original signal s(t), in the first step the analytic signal is computed from the original signal by the Hilbert transform. The Hilbert transform h(t) = H{s(t)} is defined as [20]:

$$h(t) = H\{s(t)\} = \frac{1}{\pi}\,\mathrm{p.v.}\int_{-\infty}^{\infty}\frac{s(\tau)}{t-\tau}\,d\tau \qquad (1)$$

Therefore, the Hilbert transform can be regarded as a 90° phase shifter, and it leads to the analytic signal. The analytic signal of s(t) is determined as:

$$a(t) = s(t) + j\,h(t) \qquad (2)$$

In the second step, the envelope signal is obtained by calculating the magnitude (absolute value) of the analytic signal, as shown in (3):

$$e(t) = |a(t)| = \sqrt{s^{2}(t) + h^{2}(t)} \qquad (3)$$

Finally, in the third step, the fast Fourier transform (FFT) is applied to the envelope signal to obtain the envelope spectrum. In this step, the fault information is retrieved from the amplitudes at the fault frequencies in the spectrum.
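The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes NumPy and SciPy are available, the function name `envelope_spectrum` is hypothetical, and the test signal is synthetic (a 3 kHz carrier amplitude-modulated at 100 Hz) rather than a HUST bearing recording. Removing the mean of the envelope before the FFT is an added convenience so that the DC component does not dominate the spectrum.

```python
import numpy as np
from scipy.signal import hilbert


def envelope_spectrum(signal, fs):
    """Return (frequencies, amplitudes) of the Hilbert envelope spectrum."""
    signal = np.asarray(signal, dtype=float)
    analytic = hilbert(signal)               # step 1: analytic signal s(t) + j*h(t)
    envelope = np.abs(analytic)              # step 2: magnitude of the analytic signal
    envelope = envelope - envelope.mean()    # drop the DC offset (added convenience)
    n = envelope.size
    amplitudes = np.abs(np.fft.rfft(envelope)) * 2.0 / n   # step 3: FFT of the envelope
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return freqs, amplitudes


# Synthetic example (assumed values): 1 s sampled at 51.2 kHz
fs = 51_200
t = np.arange(fs) / fs
x = (1.0 + 0.5 * np.cos(2 * np.pi * 100 * t)) * np.sin(2 * np.pi * 3000 * t)
freqs, amps = envelope_spectrum(x, fs)
peak_hz = freqs[np.argmax(amps)]   # dominant modulation frequency in the envelope
```

On a real recording, the peaks of `amps` would be compared with the expected fault frequencies of the bearing under test.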

Table 4: Bearing fault frequencies
When rolling elements pass over defects in the races, they excite characteristic kinematic (fault) frequencies that are clearly observable in the envelope spectrum. A similar phenomenon occurs when defective rolling elements spin. Therefore, detecting fault frequencies in the envelope spectrum is equivalent to detecting the fault. The theoretical fault frequency formulas, calculated from the geometrical parameters of the bearing, are shown in Table 4. In Table 4, N is the number of balls, F is the shaft frequency, B is the ball diameter, P is the pitch diameter (the mean of the inner and outer diameters), and θ is the contact angle.
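As an illustration, the sketch below evaluates the standard kinematic fault frequency formulas from the literature using the symbols defined above (N, F, B, P, θ). It assumes that Table 4 lists these standard expressions for a stationary outer race. The function name and the geometry values in the example are hypothetical and are not taken from Table 2.

```python
import math


def bearing_fault_frequencies(n_balls, shaft_hz, ball_d, pitch_d, contact_deg=0.0):
    """Standard kinematic fault frequencies (Hz) for a fixed outer race."""
    ratio = (ball_d / pitch_d) * math.cos(math.radians(contact_deg))
    return {
        # ball pass frequency, outer race
        "BPFO": 0.5 * n_balls * shaft_hz * (1.0 - ratio),
        # ball pass frequency, inner race
        "BPFI": 0.5 * n_balls * shaft_hz * (1.0 + ratio),
        # ball spin frequency
        "BSF": 0.5 * (pitch_d / ball_d) * shaft_hz * (1.0 - ratio ** 2),
        # fundamental train (cage) frequency
        "FTF": 0.5 * shaft_hz * (1.0 - ratio),
    }


# Illustrative geometry only (assumed): 9 balls, 7.94 mm balls, 39.04 mm pitch diameter
example = bearing_fault_frequencies(n_balls=9, shaft_hz=25.0, ball_d=7.94, pitch_d=39.04)
for name, value in example.items():
    print(f"{name}: {value:.2f} Hz")
```

Dividing each value by the shaft frequency gives the dimensionless ratio to the shaft frequency, which is the quantity reported per bearing type in this kind of table.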

Order tracking and Hilbert-Huang transform
In the transient state, e.g., during run-up or coast-down of the electric motor, the shaft speed is not constant, so envelope analysis cannot be performed directly. Instead, order tracking is an effective approach for bearing fault detection under varying speed conditions [21]. The objective of order tracking is to transform a measured signal from the time domain to the angular (order) domain. These techniques are applied to asynchronously sampled signals (with a constant sampling rate) to obtain the same signal sampled at constant angle increments of a reference shaft.
Hence, the instantaneous shaft phase must be known a priori. Fortunately, in the HUST bearing dataset, the speed profile of the shaft is provided for each vibration recording, so the instantaneous shaft phase can be calculated.
In the past, three main families of order tracking techniques have been developed: computed order tracking [22], the Vold-Kalman filter [23], and order tracking transforms [24].
Among these techniques, computed order tracking (COT) is a simple, interpolation-based resampling technique, so it is widely used in practice. In this examination, COT is applied to the vibration signals during run-up to detect bearing faults. In detail, given an original signal x(t), its speed profile v(t), and a desired angular resolution Δφ, the resampled signal is computed as follows: (i) compute the phase angle of the shaft; (ii) obtain the vector of time instants at which the shaft reaches equally spaced angles; (iii) obtain the angular resampled signal.
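The three COT steps can be sketched as follows. This is a minimal illustration under stated assumptions: the speed profile is given in revolutions per second at the same sampling instants as the signal, the speed is strictly positive, linear interpolation is used in both interpolation steps, and the angular resolution Δφ is expressed as `1 / samples_per_rev` revolutions. The function name and parameters are hypothetical.

```python
import numpy as np


def computed_order_tracking(x, speed_hz, fs, samples_per_rev=64):
    """Resample x(t) at constant shaft-angle increments.

    Returns the angle grid (in revolutions) and the resampled signal.
    """
    x = np.asarray(x, dtype=float)
    speed_hz = np.asarray(speed_hz, dtype=float)
    t = np.arange(x.size) / fs

    # (i) shaft phase in revolutions: trapezoidal integral of the speed profile
    increments = 0.5 * (speed_hz[1:] + speed_hz[:-1]) / fs
    phase_rev = np.concatenate(([0.0], np.cumsum(increments)))

    # (ii) time instants at which the shaft reaches equally spaced angles
    angle_grid = np.arange(0.0, phase_rev[-1], 1.0 / samples_per_rev)
    t_resampled = np.interp(angle_grid, phase_rev, t)

    # (iii) signal values interpolated at those time instants
    x_resampled = np.interp(t_resampled, t, x)
    return angle_grid, x_resampled


# Synthetic run-up (assumed values): speed rises linearly from 5 to 25 rev/s in 2 s,
# and the signal contains a single component at 3 times the shaft speed (order 3).
fs = 10_000
t = np.arange(2 * fs) / fs
speed = 5.0 + 10.0 * t
true_phase = 5.0 * t + 5.0 * t ** 2
x = np.sin(2 * np.pi * 3 * true_phase)
angles, x_ang = computed_order_tracking(x, speed, fs)
```

In the angular domain the order-3 component has a constant period of one third of a revolution, even though its frequency in time keeps increasing during the run-up.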
Although the angular resampled signal has periodic impulses, the amplitudes of these impulses are non-periodic. Hence, envelope analysis cannot be applied directly to extract fault characteristic frequencies. It is clear that at each fault impulse a modulation signal is generated, i.e., fault characteristics can be tracked by identifying the instantaneous frequency of an amplitude-modulated signal.
The fundamental part of the Hilbert-Huang transform (HHT) is the empirical mode decomposition (EMD) method. Using EMD, any complicated data set can be decomposed into a finite and often small number of components. These components form a complete and nearly orthogonal basis for the original signal, and they are described as intrinsic mode functions (IMFs) [27]. The amplitude and frequency of an IMF can vary with time, and an IMF must satisfy two rules: (i) the number of extrema and the number of zero-crossings must either be equal or differ at most by one; (ii) the mean value of the envelope defined by the local maxima and the envelope defined by the local minima is near zero at any point. Figure 5 shows the sifting process used by EMD to extract IMFs from a given signal s(t). The original signal s(t) can then be expressed as shown in (7), where c_i(t) is the i-th IMF, n is the number of IMFs, and r(t) is the residue:

$$s(t) = \sum_{i=1}^{n} c_i(t) + r(t) \qquad (7)$$

With the obtained IMFs, the Hilbert transform is applied to each IMF component c_i(t) to compute the corresponding analytic signal a_i(t) using (1) and (2). Thus, leaving out the residue, the original signal s(t) can be expressed as the real part in the following form:

$$s(t) = \mathrm{Re}\left\{\sum_{i=1}^{n} A_i(t)\, e^{\,j\int \omega_i(t)\,dt}\right\}$$

where A_i(t) = |a_i(t)| is the instantaneous amplitude and ω_i(t) is the instantaneous angular frequency of the i-th IMF. The resulting angular resampled signals and the corresponding HHT spectra of the examined bearing are shown in Figure 6.
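The Hilbert step applied to each IMF can be sketched as below. This is only the second half of the HHT: the EMD sifting itself is not implemented here, and the input is assumed to be a single, already extracted IMF. The function name is hypothetical, and the instantaneous frequency is estimated by a simple first difference of the unwrapped phase.

```python
import numpy as np
from scipy.signal import hilbert


def instantaneous_amplitude_frequency(component, fs):
    """Instantaneous amplitude and frequency (Hz) of one mono-component signal."""
    analytic = hilbert(np.asarray(component, dtype=float))   # a_i(t)
    amplitude = np.abs(analytic)                             # A_i(t)
    phase = np.unwrap(np.angle(analytic))                    # instantaneous phase
    frequency = np.diff(phase) * fs / (2.0 * np.pi)          # one sample shorter
    return amplitude, frequency


# Synthetic stand-in for an IMF (assumed): a pure 50 Hz tone sampled at 1 kHz
fs = 1_000
t = np.arange(fs) / fs
imf = np.sin(2 * np.pi * 50 * t)
amp, freq = instantaneous_amplitude_frequency(imf, fs)
```

Plotting `amp` against time and `freq` for every IMF gives the kind of time-frequency (or angle-order) energy map referred to as the HHT spectrum.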

Classification performance
In this section, the most common machine learning algorithms are applied to our dataset to predict the type of fault. The workflow consists of extracting features from the data, training a classification model, and evaluating the results.
Statistical time-domain features such as mean, root mean square (RMS), and variance are frequently used to identify the discrepancy between one vibration signal and another. The RMS value represents the power of a vibration signal, so it is useful for detecting imbalance in rotating machinery. RMS is the most prevalent measure of defects in the time domain, but it does not work when the fault is at an early stage [28]. Another measure is the crest factor [29], which can be an effective index for detecting outer race faults because of its sensitivity to impulsive vibration sources. In addition, skewness and kurtosis [30] are more advanced statistical features that can be applied to signals that are not purely stationary. An overview of the statistical time-domain features is provided in Table 5, where N denotes the number of samples, x_i denotes the i-th sample, and μ and σ denote the expectation and standard deviation, respectively.
In this work, we prefer the constant Q transform (CQT) [33], which is a variation of the wavelet transform (WT). Because the length of the sample array (buffer size) used to perform the transform varies with frequency, the CQT represents low frequencies well and maps frequency onto a logarithmic scale by increasing the buffer size for lower frequencies. The smaller buffer size at high frequencies also reduces the computing overhead.
Low-frequency components of the collected vibration signals are more informative about bearing health than high-frequency ones, which are more affected by noise. The CQT depends on the following key elements: (i) the window functions gk, which are real-valued, even functions whose Fourier transforms are defined in the interval [−Fs/2, Fs/2]; (ii) the sampling rate ωs; (iii) the number of bins per octave, b; (iv) the minimum and maximum frequencies, ωmin and ωmax, respectively.
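To make the statistical time-domain features of Table 5 concrete, the sketch below computes RMS, variance, skewness, kurtosis, and crest factor for one signal segment. The definitions used are the common textbook ones (population moments, kurtosis not reduced by 3), which are assumed to match Table 5. The function name is hypothetical.

```python
import numpy as np


def time_domain_features(segment):
    """Five statistical features of one vibration segment (non-constant input)."""
    x = np.asarray(segment, dtype=float)
    mu = x.mean()
    sigma = x.std()                      # population standard deviation
    centered = x - mu
    rms = np.sqrt(np.mean(x ** 2))
    return {
        "rms": rms,
        "variance": sigma ** 2,
        "skewness": np.mean(centered ** 3) / sigma ** 3,
        "kurtosis": np.mean(centered ** 4) / sigma ** 4,
        "crest_factor": np.max(np.abs(x)) / rms,
    }


# Synthetic segment (assumed): 5 full cycles of a unit sine in 2048 samples
n = np.arange(2048)
features = time_domain_features(np.sin(2 * np.pi * 5 * n / 2048))
```

A healthy, smooth signal such as the sine above has a low kurtosis and crest factor, whereas impulsive fault signatures push both values up, which is why these two features are sensitive to localized defects.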

Classification models
❖ Decision tree
One of the most prevalent techniques for supervised machine learning is the decision tree, which is used to solve regression and classification problems. In this approach, the whole dataset is labeled and the data are divided among the nodes. The tree takes the form of a flow chart in which each internal node tests an attribute, each branch represents an outcome of that test, and each leaf represents the final decision (class label). The purpose of the decision tree technique is to construct a model that predicts the target variable by learning decision rules. Every tree has a root node, where the inputs enter; the data are then divided into subsets at decision nodes, which hold the results of each split. Nodes that do not split further are called terminal nodes or leaf nodes. The major advantages of this approach are its ability to handle both numerical and categorical attributes and its low computation time.
Overall, this method is efficient for small datasets but can be slow for large datasets.

❖ K-nearest neighbor (KNN)
KNN is a non-parametric, lazy machine learning algorithm, meaning that it makes no assumptions about the underlying data distribution [34]. The value k is a user-defined constant, and the algorithm finds the k existing samples most similar to the new case. The principle of this technique is to let the neighbors vote: using the Euclidean distance between data points, the object is assigned to the most common class among its k nearest neighbors.
KNN still has some limitations, such as a high computation cost, because the distance to all samples must be measured, and poor performance when the training dataset is very large.
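The voting principle can be sketched with plain NumPy as follows. This is a brute-force illustration with a hypothetical function name, not the configuration used in the experiments: it computes the Euclidean distance from each query to every training sample, which is exactly the cost that limits KNN on large training sets.

```python
import numpy as np


def knn_predict(train_x, train_y, query_x, k=3):
    """Classify each query point by majority vote among its k nearest neighbors."""
    train_x = np.asarray(train_x, dtype=float)
    train_y = np.asarray(train_y)
    query_x = np.atleast_2d(np.asarray(query_x, dtype=float))
    predictions = []
    for q in query_x:
        distances = np.linalg.norm(train_x - q, axis=1)   # Euclidean distances
        nearest = np.argsort(distances)[:k]               # indices of the k closest samples
        labels, counts = np.unique(train_y[nearest], return_counts=True)
        predictions.append(labels[np.argmax(counts)])     # most common label wins
    return np.array(predictions)


# Toy feature vectors (assumed): two well-separated classes, "N" and "I"
train_x = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
train_y = ["N", "N", "N", "I", "I", "I"]
predicted = knn_predict(train_x, train_y, [[0.2, 0.1], [5.5, 5.2]], k=3)
```

Ties between classes are resolved here by taking the first label in sorted order, which is an arbitrary choice of this sketch.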

❖ Support vector machine (SVM)
The purpose of SVM is to find the best decision boundary (hyperplane) between classes, which is determined by the points on the edge of each class. By separating the n-dimensional space, the data are divided into the correct categories. The distance between classes is defined as the margin, which affects the accuracy of the classification, i.e., a larger margin generally means better accuracy. The data points on the border of the margin are known as support vectors. This technique can handle both linear and non-linear data by using various kernel types such as linear, radial basis function [35], polynomial, and sigmoid.
In practice, it solves small-dataset problems effectively but is less efficient for large datasets. The figure below illustrates an instance of the SVM algorithm.

❖ Artificial neural network (ANN)
ANN is a computational network inspired by the biological theory of neurons in the brain.
Just as a human brain comprises numerous interconnected neurons, an ANN consists of neurons (nodes) that are linked across the layers of the network. The main elements are the input layer, the hidden layers, and the output layer.
As the names suggest, the input layer receives the input data, which can be in different formats, and the output layer represents the result. The hidden layers extract features from the input data and pass them to the output layer. Determining the size of the hidden layers is a complex task, because underestimating the number of hidden layers leads to poor performance while overestimating it may result in overfitting. The computational principle of ANN is the transfer function:

$$a^{(l)} = f\left(W^{(l)}\,a^{(l-1)} + b^{(l)}\right)$$

where W^{(l)} is the weight matrix, a^{(l)} is the vector of node values in the l-th layer, b^{(l)} denotes the bias, and f is the activation function.
The weighted sum and bias are calculated from the input data and passed through the hidden layers, where they serve as the input of the chosen activation function to generate the desired output.
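The layer-by-layer computation described above can be sketched as a forward pass in NumPy. The layer sizes (5 inputs, 8 hidden nodes, 7 outputs), the ReLU and softmax activations, and the random weights are assumptions chosen only for illustration; they are not the configuration used in the experiments, and no training is performed.

```python
import numpy as np


def relu(z):
    return np.maximum(z, 0.0)


def softmax(z):
    e = np.exp(z - z.max())      # shift by the maximum for numerical stability
    return e / e.sum()


def forward(x, layers):
    """Apply a = f(W @ a_prev + b) for each (W, b, f) in layers."""
    a = np.asarray(x, dtype=float)
    for weights, bias, activation in layers:
        a = activation(weights @ a + bias)
    return a


# Hypothetical network: 5 features -> 8 hidden nodes -> 7 class probabilities
rng = np.random.default_rng(0)
layers = [
    (rng.normal(size=(8, 5)), np.zeros(8), relu),
    (rng.normal(size=(7, 8)), np.zeros(7), softmax),
]
probabilities = forward(rng.normal(size=5), layers)
```

With a softmax output layer, the entries of `probabilities` are non-negative and sum to one, so the predicted class is simply the index of the largest entry.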

❖ Convolution neural network (CNN)
CNN is currently one of the most prevalent deep learning approaches and is used in various applications.
This method is similar to the ANN technique and can be seen as an acyclic graph with a collection of neurons in a well-arranged form. In each stage of a CNN, multi-dimensional kernels are convolved with the input of the layer. The fundamental architecture of a CNN consists of convolutional layers, pooling layers, and fully connected layers, as drawn in Figure 7.

Classification results
From the bearing vibration data files, we generate data segments of length 2048 with 75% overlap. The length of one segment corresponds to about one revolution of the motor shaft.
Then, from these segments, we randomly select 17,500 segments corresponding to 7 fault states across the 5 types of bearing (2500 segments per fault state) and divide them into a training set (80%) and a test set (20%). The fault states are: normal (N), inner (I), outer (O), ball (B), inner and outer (IO), inner and ball (IB), and outer and ball (OB).
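The segmentation step can be sketched as follows. The function name is hypothetical, and the example array merely stands in for one 10 s recording sampled at 51.2 kHz (512,000 samples). With a segment length of 2048 and 75% overlap, consecutive segments start 512 samples apart.

```python
import numpy as np


def segment_signal(signal, length=2048, overlap=0.75):
    """Split a 1-D signal into overlapping segments of fixed length."""
    signal = np.asarray(signal)
    step = int(round(length * (1.0 - overlap)))        # 512 samples for 75% overlap
    n_segments = (signal.size - length) // step + 1
    if n_segments < 1:
        return np.empty((0, length), dtype=signal.dtype)
    starts = np.arange(n_segments) * step
    return np.stack([signal[s:s + length] for s in starts])


# Stand-in for one recording: 10 s at 51.2 kHz
recording = np.arange(512_000)
segments = segment_signal(recording)
print(segments.shape)
```

The random selection of 2500 segments per fault state and the 80/20 split would then be applied on top of the segments pooled from all recordings.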
Since every class contains the same amount of data, classification performance is evaluated with the common metrics: overall accuracy, precision, recall, and F1-score. In addition, the confusion matrix is used to observe classification accuracy by class.
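These metrics can all be derived from the confusion matrix, as the sketch below shows. It assumes integer class labels starting at 0, with rows of the matrix indexing the true class and columns the predicted class. The function names are hypothetical, and per-class values are returned so that they can be averaged as needed.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, n_classes):
    """Rows index the true class, columns the predicted class."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm


def classification_metrics(cm):
    """Overall accuracy plus per-class precision, recall, and F1-score."""
    tp = np.diag(cm).astype(float)
    predicted = cm.sum(axis=0)       # column sums: samples predicted as each class
    actual = cm.sum(axis=1)          # row sums: samples truly in each class
    precision = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
    recall = np.divide(tp, actual, out=np.zeros_like(tp), where=actual > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom, out=np.zeros_like(tp), where=denom > 0)
    accuracy = tp.sum() / cm.sum()
    return accuracy, precision, recall, f1


# Toy labels (assumed) for three classes
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, n_classes=3)
accuracy, precision, recall, f1 = classification_metrics(cm)
```

Because the classes are balanced, the unweighted mean of the per-class values is a reasonable single-number summary of precision, recall, and F1-score.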

❖ Time domain
First, the machine learning algorithms are trained on vibration features extracted in the time domain: RMS, variance, skewness, crest factor, and kurtosis. These features are computed from the segments in the training set, giving feature vectors of length 5 as training data for the models. The models used to evaluate how well our dataset can be classified are kNN, DT, SVM, NB, and ANN. Their training configurations are left at the defaults, similar to the Classification Learner application in MATLAB. The mean classification results for the models are shown in Table 7. The classification results for the time-domain features are relatively low. Specifically, the accuracy of every model is below 60%, with the highest value belonging to the kNN algorithm (57.03%). The accuracy of the remaining models ranges from 41.3% to 50.83%. These values are well above the chance level of about 14% for seven balanced classes, so the time-domain features do carry class information. However, they are not really effective for bearing fault diagnosis because they lack frequency information. This is a common characteristic of bearing fault classification methods based on time-domain features.

❖ Frequency domain
Similarly, features in the frequency domain are computed to generate feature vectors of length 5. The components of the feature vector are: mean frequency (MF), frequency center (FC), root mean square frequency (RMSF), standard deviation frequency (STDF), and root variance frequency (RVF). The classification results for the bearing faults in the frequency domain are shown in Table 8. In general, the overall accuracy with frequency-domain features is significantly higher than with time-domain features. This shows that the information in the frequency domain separates the fault classes well. The highest classification accuracy belongs to the SVM model, with 100%. The other models give results ranging from 81.54% to more than 90%.

❖ Time-frequency domain
The use of time-frequency transformations has been shown to be highly effective for bearing defect classification [36][37][38][39]. Therefore, the classification ability of machine learning models based on time-frequency characteristics is also investigated. The CQT is used to generate grayscale spectral images, as shown in Figure 8. Following previous work [40], the spectral images have a resolution of 64x64. For models that require vector input, the spectral image is flattened into a row vector. The prediction results of the LeNet, CNN, kNN, DT, and SVM models with spectral image input are shown in Table 8. Note that the proposed CNN model is an improved version of the LeNet model, consisting of 2 convolutional blocks with 16 and 8 feature maps, 5x5 kernels, batch normalization, and a ReLU activation function. It can be seen that methods using time-frequency characteristics achieve accuracy as high as in the frequency domain.
However, the advantage of these methods is that they can work correctly with test data recorded at different working speeds [41]. The LeNet model is not as effective as its improved CNN version. The reason is that the original LeNet model does not use batch normalization and uses a sigmoid activation function instead of ReLU. From our observations, except for the LeNet and DT models, the remaining models have accuracy above 91.67%, with the SVM model achieving the highest accuracy (97.83%).
Figure 9 is the confusion matrix of the models corresponding to

Unsupervised transfer learning performance
In this section, our dataset is evaluated on unsupervised transfer learning (UTL) tasks. UTL algorithms are increasingly popular in bearing fault diagnosis because of their ability to predict the fault type of newly collected, unlabeled test data. Our dataset consists of five closely related bearing models, recorded under similar working and data collection conditions. It is therefore expected that the fault types of a specific bearing can be predicted well using training data from the other bearings in our dataset.

Background
In UTL, it is assumed that there are two sets of data, called the source domain and the target domain. The source domain consists of labeled data obtained from available datasets (previously published or from laboratories), whereas the target domain contains unlabeled data from locomotive bearings. UTL algorithms are designed so that knowledge learned from the source domain can be used to diagnose faults in the target domain. However, this is often difficult because the distributions of the source and target domains differ, which is influenced by the fault type, the bearing type, and the working and data collection conditions [42].

Unsupervised transfer learning results
UTL models generally do not provide accuracy as high as supervised learning models.
Therefore, in order to improve the efficiency of the UTL methods, we arrange the experiments as follows. First, we define the set of labels used in the experiments, which are the single-fault states: normal (N), inner fault (I), outer fault (O), and ball fault (B). This limits the errors that arise when the model has to recognize too many unlabeled fault types. Our idea is to split the dataset into two domains: a source domain of labeled data to train the UTL model and a target domain of unlabeled test data. Each target domain is the data of one specific bearing, and we choose four target domains for four different tasks: 6205, 6206, 6207, and 6208 (bearing 6204 was rejected because of insufficient labels). The source domain for each target domain then consists of the data of the remaining bearings. For example, for the target domain of bearing 6208 (referred to as task A-8), the source domain is the data of bearings 6204, 6205, 6206, and 6207. Task details and related information are listed in Table 10.
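The leave-one-bearing-out arrangement described above can be written down as a small task table. The sketch below only builds the source and target bearing lists for each task; it does not load data or train a model. The naming pattern "A-" followed by the last digit of the target bearing is inferred from the task names A-5 to A-8 used in the text, and the function and variable names are hypothetical.

```python
BEARINGS = ["6204", "6205", "6206", "6207", "6208"]
TARGETS = ["6205", "6206", "6207", "6208"]    # 6204 is never used as a target
LABELS = ["N", "I", "O", "B"]                  # single-fault label set


def build_tasks(bearings=BEARINGS, targets=TARGETS):
    """Map each task name to its labeled source bearings and unlabeled target bearing."""
    tasks = {}
    for target in targets:
        name = f"A-{target[-1]}"
        tasks[name] = {
            "source": [b for b in bearings if b != target],
            "target": target,
        }
    return tasks


for name, task in build_tasks().items():
    print(name, "source:", task["source"], "target:", task["target"])
```

Keeping the task definition in one place like this makes it straightforward to rerun every method on every task, or to drop a bearing from a source domain for an ablation.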
Another point of note in the UTL experiments is that the raw data are divided into chunks of length 4096 with no overlap to create the input data for the learning models, unlike the segments of length 2048 with 75% overlap used in the classification experiments above. A longer chunk gives finer frequency resolution (12.5 Hz instead of 25 Hz at 51.2 kHz), which helps to distinguish the inner race fault from the ball fault, whose fault frequencies are close. In addition, we also apply the UTL methods to frequency-domain input (simply the fast Fourier transform of each 4096-point time-domain chunk). The metric used to evaluate the UTL methods is overall accuracy, i.e., the ratio of the number of correct labels to the total number of predicted labels. We run each task five times to reduce the effect of randomness, and the mean overall accuracy is used to assess the final results. The configuration settings for the training and testing procedures follow [53]. The performance on the UTL tasks varies with the method and the input type. To make the results easier to follow, the two better tasks, A-5 and A-6, are compared in Figure 12. We can observe that the overall accuracies of the different methods and input types all lie in the range of 70.5% to 77.2%. The knowledge-transfer ability of the different methods clearly does not differ much, and this is also true for the two types of input.
From these results we cannot say which method or input is more efficient than the others in the UTL problem. The only thing we can conclude is that the data set performs well in the unsupervised transfer learning task. Because our data set has good knowledge transfer capacity, the data and results presented in this study can serve as a reference for future studies on improving transfer learning performance. A point of interest in the unsupervised transfer learning experiments is the outcome of tasks A-7 and A-8. The results of these tasks are detailed in Figure 13. Tasks A-7 and A-8 have significantly lower accuracy than A-5 and A-6, ranging from 60.7% to 74.3%. This is because the 6207 and 6208 bearings differ much more in size from the 6204 bearing than the 6205 and 6206 bearings do. For task A-7, the model-based and CORAL methods show little difference in accuracy (about 70%) between the two inputs.
Meanwhile, for the remaining models, the transfer accuracy differs by about 6-8% between the time-domain input and the frequency-domain input. In particular, the frequency input is better for the instance-based and JMMD models, and the opposite holds for the MK-MMD model. For task A-8, as in the other tasks, the accuracy gap between models is relatively small even though the inputs are different. However, the instance-based method gave unusual results, with over 83% accuracy. This appears to be because this method is particularly suitable for transferring knowledge from bearing 6204 data to bearing 6208 data. Indeed, when the 6204 bearing data are removed from the source domain, the accuracy of the instance-based method for A-8 drops to 64.7% and 64.9%. In summary, for tasks A-7 and A-8, it is also impossible to conclude which method is more effective than the others. The features in the time domain are visualized with the t-SNE algorithm in Figure 14 for reference.

Conclusion
A novel dataset for intelligent fault diagnosis of ball bearings, called HUST bearing, has been published in this paper. Classical signal processing and analysis algorithms for bearing fault diagnosis were tested to provide an analytical benchmark for the dataset. It has been found that traditional diagnostic methods are applicable to some data files in the dataset but not to others.
Modern methods have shown excellent classification ability, demonstrating that the dataset carries useful information on bearing defects and can be used for fault diagnosis and prognostic applications. Moreover, the dataset gives good results on unsupervised transfer learning tasks and is promising for industrial applications. Scientists testing new diagnostic algorithms on the HUST bearing dataset are recommended to refer to the results presented in this paper. In the future, the HUST bearing dataset will be supplemented with other information related to operating conditions to expand its research directions and practical applications.

Figure 2: Illustrations for defects on actual test bearings: (a) inner crack; (b) outer crack; (c) ball crack; (d) inner and outer cracks; (e) inner and ball cracks; (f) outer and ball cracks.
i.e., fault characteristics can be tracked by identifying the instantaneous frequency of an amplitude-modulated signal. Therefore, in practice, time-frequency analyses are used to observe the presence of faults. The objective of time-frequency analysis is to examine how the frequency content of a nonstationary signal changes over time. In this examination, the Hilbert-Huang transform (HHT) [25] is applied to the angularly resampled vibration signal to detect bearing defects during the run-up. Compared to other techniques such as the short-time Fourier transform, Morlet wavelet transform, or Wigner-Ville distribution, the Hilbert-Huang transform is better at localizing the energy in both the time and frequency domains [26]. The Hilbert-Huang transform consists of two main steps: (i) decompose the original signal using the empirical mode decomposition method; (ii) use the Hilbert transform to compute the instantaneous frequency.
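Step (ii) can be illustrated with a minimal sketch. It assumes the signal is already a single mono-component mode (the empirical mode decomposition of step (i) is omitted), builds the analytic signal by zeroing the negative-frequency half of a naive DFT, and reads the instantaneous frequency from the phase increment between consecutive samples. The function names, the 20 Hz test tone, and the 256 Hz sampling rate are illustrative assumptions and are not taken from the dataset or the paper's implementation.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (adequate for short demo signals)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def analytic_signal(x):
    """Analytic signal x + j*H[x], obtained by suppressing negative frequencies."""
    n = len(x)
    spectrum = dft(x)
    gain = [0.0] * n
    gain[0] = 1.0                      # keep the DC term once
    if n % 2 == 0:
        gain[n // 2] = 1.0             # keep the Nyquist term once
        for k in range(1, n // 2):
            gain[k] = 2.0              # double the positive frequencies
    else:
        for k in range(1, (n + 1) // 2):
            gain[k] = 2.0
    return dft([s * g for s, g in zip(spectrum, gain)], inverse=True)


def instantaneous_frequency(x, fs):
    """Instantaneous frequency in Hz from sample-to-sample phase increments."""
    z = analytic_signal(x)
    freqs = []
    for a, b in zip(z[:-1], z[1:]):
        dphi = cmath.phase(b * a.conjugate())   # wrapped to [-pi, pi]
        freqs.append(dphi * fs / (2 * math.pi))
    return freqs


# Hypothetical test tone: 20 Hz cosine, 1 s long, sampled at 256 Hz.
fs = 256.0
n = 256
tone_hz = 20.0
signal = [math.cos(2 * math.pi * tone_hz * t / fs) for t in range(n)]
inst_freq = instantaneous_frequency(signal, fs)
mid_freq = inst_freq[n // 2]
```

For a real run-up signal, the same computation would be applied to each intrinsic mode function produced by the decomposition step, and an FFT routine would replace the naive DFT for signals of realistic length.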

4.1. Feature extraction and selection
❖ Time domain
A series of digital values representing displacement, velocity, or acceleration in the time domain are obtained as vibration signals. Statistical time-domain features such as mean, root mean square (RMS), and variance are frequently utilized to identify the discrepancy between one
well-known methods according to three categories of UTL: network-based, instance-based, and feature-based. Then, dataset evaluation experiments are conducted with those methods to give an overview of the dataset's performance on UTL.
❖ Network-based UTL
The basic concept of network-based UTL is that part of the network of a pretrained model is reused in another model. The pretrained model refers to the model trained on data in the source domain. After training, the feature extraction part of the model is cut out and attached to another classifier to build a target network for data in the target domain. If a small amount of labeled data is available in the target domain, it is used to fine-tune the target network to improve its prediction accuracy. However, in this work, all data labels in the target domain are assumed to be unavailable, so training the target network is impossible. For that reason, the pretrained network is used to test the data in the target domain directly, i.e., the pretrained model and its parameters are shared between the source domain and the target domain. The base architecture of the pretrained model, or backbone, is illustrated in Figure 10, and for fair comparison, all UTL experiments in this work are performed on this backbone.

Figure 10: Backbone architecture illustration. Each convolution layer has its kernel size (k), number of features (n), and stride (s).

Figure 11: Feature-based UTL models with different distance functions.
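As one example of such a distance function, the sketch below computes a CORAL-style distance between source and target feature sets. It assumes the commonly used form, the squared Frobenius norm of the difference between the two feature covariance matrices divided by 4d², where d is the feature dimension. Whether the experiments in this work use exactly this form is an assumption, and the function names and toy features are hypothetical.

```python
def covariance(rows):
    """Sample covariance matrix (d x d) of a list of d-dimensional feature rows."""
    n = len(rows)
    d = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    cov = [[0.0] * d for _ in range(d)]
    for r in rows:
        for i in range(d):
            for j in range(d):
                cov[i][j] += (r[i] - means[i]) * (r[j] - means[j])
    return [[c / (n - 1) for c in row] for row in cov]


def coral_distance(source, target):
    """Assumed CORAL form: ||C_s - C_t||_F^2 / (4 d^2)."""
    d = len(source[0])
    cs = covariance(source)
    ct = covariance(target)
    frob_sq = sum((cs[i][j] - ct[i][j]) ** 2 for i in range(d) for j in range(d))
    return frob_sq / (4 * d * d)


# Hypothetical 2-D features: the target is either a shifted or a rescaled source.
source_feats = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
shifted_feats = [[a + 5.0, b + 5.0] for a, b in source_feats]
scaled_feats = [[2 * a, 2 * b] for a, b in source_feats]

same = coral_distance(source_feats, source_feats)
shifted = coral_distance(source_feats, shifted_feats)
scaled = coral_distance(source_feats, scaled_feats)
```

The distance depends only on second-order statistics, so a pure shift of the target features leaves it at zero while a rescaling does not. In a feature-based model, a term of this kind would typically be added to the classification loss computed on the source labels.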

Figure 12: Accuracy comparisons of five UTL methods with time and frequency-domain input (a) Task A-5; (b) Task A-6.

Figure 13: Accuracy comparisons of five UTL methods with time and frequency-domain input (a) Task A-7; (b) Task A-8.

Figure 14: t-SNE visualization of learned features on source domain and target domain.

Table 1: Comparison between popular bearing fault datasets
In addition, practical systems often have more than one bearing, so fault diagnosis for multiple bearings is an urgent and highly practical problem. This means that datasets covering diverse bearing types are required for multi-bearing fault diagnosis. These reasons motivated us to publish a new dataset of vibration signals for various types of defects on different types of bearings. The dataset provided in this paper is a helpful reference for research on intelligent bearing fault diagnosis based on learning methods.

Table 3: File name description of dataset

Table 5: Brief review of statistical time-domain features extraction formula
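A minimal sketch of three of the statistical time-domain features named in the text (mean, RMS, and variance) is given here. The table itself is not reproduced in this text, so the population form of the variance (division by N) and the function name are assumptions.

```python
import math


def time_domain_features(x):
    """Mean, root mean square, and population variance of one signal segment."""
    n = len(x)
    mean = sum(x) / n
    rms = math.sqrt(sum(v * v for v in x) / n)
    variance = sum((v - mean) ** 2 for v in x) / n   # assumed: divide by N
    return {"mean": mean, "rms": rms, "variance": variance}


# Hypothetical four-sample segment alternating between +1 and -1.
segment = [1.0, -1.0, 1.0, -1.0]
features = time_domain_features(segment)
```

In practice, each 10 s recording would be cut into fixed-length segments and the features computed per segment before being passed to a classifier.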

Table 6: Brief review of statistical frequency-domain features extraction formula
Frequency-domain features are now widely examined as a more effective approach than time-domain features. To acquire frequency-domain features, the time-domain vibration signals must be transformed into the frequency domain using the fast Fourier transform (FFT) [31]. Various types of bearing fault, such as ball, inner race, or outer race faults, can be detected because the FFT reveals the dominant frequency of the impulse period of a specific defect when the rolling elements contact the faulty point. When a fault appears in stationary periodic signals, frequency features such as the mean frequency (MF), frequency center (FC), root mean square frequency (RMSF), standard deviation frequency (STDF), and root variance frequency (RVF) will change. The MF represents the vibration energy of the frequency-domain signal, whilst FC and RMSF show the changes in the main frequency. The STDF and RVF both indicate the convergence degree of the power spectrum. These frequency features can be computed as shown in Table 6. The formulas in Table 6 use the k-th measurement of the frequency spectrum of signal x, the total number of spectrum lines M, and the frequency of the k-th spectrum line.
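The sketch below computes four of the frequency-domain features discussed in this passage from an already computed amplitude spectrum. The exact formulas of Table 6 are not reproduced in this text, so the definitions used here are assumed from common usage: MF is the average spectral amplitude, FC is the amplitude-weighted mean frequency, RMSF is the amplitude-weighted root mean square frequency, and STDF is the amplitude-weighted spread around FC. RVF is omitted because its definition cannot be recovered from the text. All names and the toy spectrum are hypothetical.

```python
import math


def frequency_domain_features(spectrum, freqs):
    """Spectral features from M spectrum amplitudes and their frequencies (Hz).

    The definitions are assumptions and may differ from those in the paper's table.
    """
    m = len(spectrum)
    total = sum(spectrum)
    mf = total / m
    fc = sum(f * s for f, s in zip(freqs, spectrum)) / total
    rmsf = math.sqrt(sum(f * f * s for f, s in zip(freqs, spectrum)) / total)
    stdf = math.sqrt(sum((f - fc) ** 2 * s for f, s in zip(freqs, spectrum)) / total)
    return {"MF": mf, "FC": fc, "RMSF": rmsf, "STDF": stdf}


# Hypothetical spectrum with two equal peaks at 10 Hz and 30 Hz.
freqs = [0.0, 10.0, 20.0, 30.0]
spectrum = [0.0, 1.0, 0.0, 1.0]
spec_features = frequency_domain_features(spectrum, freqs)
```

With these definitions, RMSF² = FC² + STDF², which gives a quick consistency check on any implementation.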

Table 7: Classification performance of ML models with features in time domain

Table 8: Classification performance of ML models with features in frequency domain

Table 9.
It can be seen that the class-specific accuracy in each model is relatively uniform. In particular, classes N and OB have the highest accuracy in all predictors (both above 90%). Meanwhile, the lowest accuracy belongs to class IB, with per-model accuracies of 40.2%, 89.1%, 89.3%, and 95.6%.

Table 9: Classification performance of ML models with spectrogram images input

Table 10: Fault diagnosis UTL tasks of HUST bearing dataset