Overview

Dataset statistics

Number of variables: 5
Number of observations: 371
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 19
Duplicate rows (%): 5.1%
Total size in memory: 14.6 KiB
Average record size in memory: 40.3 B

Variable types

Numeric: 5

Warnings

Dataset has 19 (5.1%) duplicate rows [Duplicates]
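
The duplicate-row figure can be illustrated with a minimal sketch. It assumes that a "duplicate" is a row that exactly repeats an earlier row, so each extra copy counts once; the report does not state its counting rule. The function name and the toy rows are hypothetical and are not taken from the dataset.

```python
from collections import Counter


def count_duplicate_rows(rows):
    """Count rows that exactly repeat an earlier row (assumed definition)."""
    counts = Counter(tuple(row) for row in rows)
    # A row seen c times contributes c - 1 duplicates.
    return sum(c - 1 for c in counts.values())


# Hypothetical toy rows for illustration only.
rows = [(3.8, 8.0), (3.8, 8.0), (4.2, 7.0), (4.2, 7.0), (4.2, 7.0), (5.1, 12.0)]
n_dup = count_duplicate_rows(rows)

# Percentage from the overview: 19 duplicate rows out of 371 observations.
pct = round(100 * 19 / 371, 1)
print(n_dup, pct)
```

Under this definition, a row that appears four times contributes three duplicates.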

Reproduction

Analysis started: 2021-05-28 13:53:33.106774
Analysis finished: 2021-05-28 13:53:37.866704
Duration: 4.76 seconds
Software version: pandas-profiling v2.11.0
Download configuration: config.yaml

Variables

Bearing 1
Real number (ℝ≥0)

Distinct: 62
Distinct (%): 16.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.268463612
Minimum: 1.3
Maximum: 14
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.0 KiB

Quantile statistics

Minimum: 1.3
5-th percentile: 4.1
Q1: 4.7
Median: 6.1
Q3: 7
95-th percentile: 9.7
Maximum: 14
Range: 12.7
Interquartile range (IQR): 2.3

Descriptive statistics

Standard deviation: 1.986277943
Coefficient of variation (CV): 0.3168683852
Kurtosis: 3.686907958
Mean: 6.268463612
Median Absolute Deviation (MAD): 1.2
Skewness: 1.589447026
Sum: 2325.6
Variance: 3.945300066
Monotonicity: Not monotonic
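
Several of the Bearing 1 figures above can be cross-checked against one another. The sketch below assumes the conventional definitions (mean = sum / n, variance = standard deviation squared, CV = standard deviation / mean, IQR = Q3 - Q1, range = maximum - minimum); the report does not state its formulas, and the variable names are illustrative.

```python
# Values copied from the Bearing 1 summary above.
n = 371
total = 2325.6
std = 1.986277943
q1, q3 = 4.7, 7.0
minimum, maximum = 1.3, 14.0

# Derived quantities under the assumed conventional definitions.
mean = total / n
variance = std ** 2
cv = std / mean
iqr = q3 - q1
value_range = maximum - minimum

print(f"mean={mean:.9f}, variance={variance:.9f}, cv={cv:.10f}")
print(f"iqr={iqr:.1f}, range={value_range:.1f}")
```

Each derived value agrees with the reported one to within rounding.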
Histogram with fixed size bins (bins=50)
Value              Count  Frequency (%)
4.6                   17  4.6%
6.1                   14  3.8%
4.7                   14  3.8%
6.3                   13  3.5%
5.1                   13  3.5%
6.8                   13  3.5%
6.7                   12  3.2%
4.4                   11  3.0%
4.1                   11  3.0%
4.3                   11  3.0%
Other values (52)    242  65.2%

Value              Count  Frequency (%)
1.3                    1  0.3%
3.6                    2  0.5%
3.7                    2  0.5%
3.8                    4  1.1%
3.9                    2  0.5%

Value              Count  Frequency (%)
14                     5  1.3%
13                     5  1.3%
12                     4  1.1%
11                     4  1.1%
9.9                    1  0.3%

Bearing 2
Real number (ℝ≥0)

Distinct: 60
Distinct (%): 16.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14.71239892
Minimum: 1.3
Maximum: 47
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.0 KiB

Quantile statistics

Minimum: 1.3
5-th percentile: 7.5
Q1: 10
Median: 12
Q3: 16
95-th percentile: 35.5
Maximum: 47
Range: 45.7
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 8.561728634
Coefficient of variation (CV): 0.5819396741
Kurtosis: 4.348909722
Mean: 14.71239892
Median Absolute Deviation (MAD): 2.3
Skewness: 2.153116908
Sum: 5458.3
Variance: 73.3031972
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value              Count  Frequency (%)
12                    59  15.9%
11                    52  14.0%
13                    32  8.6%
14                    20  5.4%
10                    18  4.9%
16                    15  4.0%
8                     10  2.7%
15                    10  2.7%
18                    10  2.7%
17                     9  2.4%
Other values (50)    136  36.7%

Value              Count  Frequency (%)
1.3                    1  0.3%
5.9                    1  0.3%
6.6                    3  0.8%
6.9                    2  0.5%
7                      2  0.5%

Value              Count  Frequency (%)
47                     2  0.5%
46                     3  0.8%
45                     2  0.5%
44                     1  0.3%
43                     3  0.8%

Bearing 3
Real number (ℝ≥0)

Distinct: 80
Distinct (%): 21.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.1410673854
Minimum: 0.029
Maximum: 0.42
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.0 KiB

Quantile statistics

Minimum: 0.029
5-th percentile: 0.056
Q1: 0.0865
Median: 0.12
Q3: 0.18
95-th percentile: 0.285
Maximum: 0.42
Range: 0.391
Interquartile range (IQR): 0.0935

Descriptive statistics

Standard deviation: 0.07340383147
Coefficient of variation (CV): 0.5203458704
Kurtosis: 1.7163783
Mean: 0.1410673854
Median Absolute Deviation (MAD): 0.041
Skewness: 1.237572961
Sum: 52.336
Variance: 0.005388122474
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value              Count  Frequency (%)
0.12                  33  8.9%
0.13                  27  7.3%
0.15                  19  5.1%
0.11                  16  4.3%
0.16                  16  4.3%
0.2                   15  4.0%
0.1                   13  3.5%
0.14                  13  3.5%
0.21                  12  3.2%
0.19                  10  2.7%
Other values (70)    197  53.1%

Value              Count  Frequency (%)
0.029                  1  0.3%
0.037                  2  0.5%
0.043                  1  0.3%
0.044                  1  0.3%
0.045                  1  0.3%

Value              Count  Frequency (%)
0.42                   3  0.8%
0.38                   2  0.5%
0.37                   2  0.5%
0.35                   1  0.3%
0.34                   1  0.3%

Axial Front
Real number (ℝ≥0)

Distinct: 75
Distinct (%): 20.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14.89326146
Minimum: 0.4
Maximum: 40
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.0 KiB

Quantile statistics

Minimum: 0.4
5-th percentile: 3.7
Q1: 7.65
Median: 14
Q3: 20
95-th percentile: 33
Maximum: 40
Range: 39.6
Interquartile range (IQR): 12.35

Descriptive statistics

Standard deviation: 8.608605555
Coefficient of variation (CV): 0.5780201725
Kurtosis: -0.04253778304
Mean: 14.89326146
Median Absolute Deviation (MAD): 6
Skewness: 0.6401550473
Sum: 5525.4
Variance: 74.1080896
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value              Count  Frequency (%)
20                    37  10.0%
14                    26  7.0%
16                    20  5.4%
19                    18  4.9%
21                    17  4.6%
15                    17  4.6%
10                    15  4.0%
13                    15  4.0%
33                    13  3.5%
23                    10  2.7%
Other values (65)    183  49.3%

Value              Count  Frequency (%)
0.4                    1  0.3%
0.5                    2  0.5%
0.8                    2  0.5%
1.5                    1  0.3%
1.6                    1  0.3%

Value              Count  Frequency (%)
40                     3  0.8%
37                     5  1.3%
33                    13  3.5%
32                     5  1.3%
29                     4  1.1%

Radial Front
Real number (ℝ≥0)

Distinct: 72
Distinct (%): 19.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 57.71428571
Minimum: 8.2
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.0 KiB

Quantile statistics

Minimum: 8.2
5-th percentile: 15.5
Q1: 42
Median: 52
Q3: 83.5
95-th percentile: 97
Maximum: 100
Range: 91.8
Interquartile range (IQR): 41.5

Descriptive statistics

Standard deviation: 25.01380129
Coefficient of variation (CV): 0.433407448
Kurtosis: -0.9940357449
Mean: 57.71428571
Median Absolute Deviation (MAD): 16
Skewness: 0.09606051909
Sum: 21412
Variance: 625.6902548
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value              Count  Frequency (%)
94                    22  5.9%
46                    19  5.1%
90                    18  4.9%
44                    16  4.3%
45                    11  3.0%
87                    11  3.0%
76                    11  3.0%
97                    10  2.7%
61                    10  2.7%
52                    10  2.7%
Other values (62)    233  62.8%

Value              Count  Frequency (%)
8.2                    1  0.3%
8.8                    1  0.3%
12                     2  0.5%
13                     2  0.5%
14                     4  1.1%

Value              Count  Frequency (%)
100                    1  0.3%
99                     9  2.4%
97                    10  2.7%
96                     1  0.3%
95                     2  0.5%

Interactions


Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear function the angle to the x-axis (the steepness of the line) does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
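
The calculation described above can be sketched in plain Python. This is a minimal illustration, not the code the report uses; the function name and the sample values are hypothetical, and the sample (n - 1) normalisation is an assumption that cancels out in the ratio.

```python
import math


def pearson_r(xs, ys):
    """Covariance of X and Y divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least two values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    return cov / (sd_x * sd_y)  # undefined (division by zero) for constant input


# Illustrative values only, not taken from the dataset.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]
r = pearson_r(x, y)

# Shifting and (positively) rescaling either variable leaves r unchanged.
r_shifted = pearson_r([10 * v + 3 for v in x], [0.5 * v - 7 for v in y])
print(r, r_shifted)
```

The second call demonstrates the location and scale invariance mentioned in the description.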

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
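
As a rough sketch of that recipe, the code below ranks each variable and then applies the covariance-over-standard-deviations formula to the ranks. Giving tied values the average of their positions is an assumption (the description does not say how ties are handled), and all names and sample values are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def spearman_rho(xs, ys):
    """Pearson's r computed on the rank variables of X and Y."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Illustrative values: a monotonic but nonlinear relationship.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(rho, r)
```

For this cubic relationship ρ reaches 1 while Pearson's r stays below 1, which is the behaviour the description refers to.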

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
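
The pair-counting formula as worded corresponds to the variant usually called tau-a. The sketch below implements that variant; it is not necessarily what produced the report's figures, because libraries such as SciPy default to tau-b, which adjusts the denominator for ties. The function name and sample values are hypothetical.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least two values")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1  # both variables order the pair the same way
        elif sign < 0:
            discordant += 1  # the variables order the pair in opposite ways
        # Pairs tied in either variable count as neither.
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Illustrative values only: one swapped pair out of six.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau_a(x, y)
print(tau)
```

Here five pairs are concordant and one is discordant, so τ = (5 - 1) / 6.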

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for Phik.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.

Sample

First rows

     Bearing 1  Bearing 2  Bearing 3  Axial Front  Radial Front
  0        6.6       12.0      0.070         20.0          39.0
  1        6.6       12.0      0.070         20.0          39.0
  2        6.7       12.0      0.063         21.0          34.0
  3        6.1       12.0      0.037         21.0          34.0
  4        4.2        7.4      0.053         21.0          34.0
  5        5.2        9.7      0.072         21.0          34.0
  6        4.1        6.6      0.073         21.0          34.0
  7        4.1        6.6      0.073         21.0          37.0
  8        4.3        7.4      0.190         27.0          45.0
  9        4.2        7.8      0.130         20.0          44.0

Last rows

     Bearing 1  Bearing 2  Bearing 3  Axial Front  Radial Front
361       14.0       47.0      0.120         17.0          18.0
362        4.5       13.0      0.037         20.0          23.0
363        5.4       13.0      0.048         20.0          15.0
364        6.1       13.0      0.140         18.0          14.0
365        6.7       13.0      0.150         20.0          15.0
366        6.9       13.0      0.160         20.0          18.0
367        3.7        7.2      0.087         18.0          17.0
368        3.8        8.0      0.079         20.0          15.0
369        3.8        8.0      0.079         20.0          15.0
370        4.7        8.2      0.290         21.0          13.0

Duplicate rows

Most frequent

     Bearing 1  Bearing 2  Bearing 3  Axial Front  Radial Front  count
 15        7.8       12.0      0.120         33.0          94.0      4
  0        3.8        8.0      0.079         20.0          15.0      2
  1        3.9        9.0      0.190          6.4          40.0      2
  2        4.2        7.0      0.150         20.0          43.0      2
  3        4.4       11.0      0.260          7.7          46.0      2
  4        4.6       13.0      0.180          7.4          54.0      2
  5        4.7       12.0      0.150         25.0          56.0      2
  6        5.1       12.0      0.200          9.9          43.0      2
  7        5.1       14.0      0.220         11.0          46.0      2
  8        5.8       11.0      0.073         19.0          47.0      2