Dataset statistics
Number of variables | 6 |
---|---|
Number of observations | 362 |
Missing cells | 0 |
Missing cells (%) | 0.0% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 17.1 KiB |
Average record size in memory | 48.4 B |
Variable types
Numeric | 6 |
---|---|
T_Machine is highly correlated with T_Ambient | High correlation |
T_Ambient is highly correlated with T_Machine | High correlation |
Reproduction
Analysis started | 2021-05-28 13:50:23.083630 |
---|---|
Analysis finished | 2021-05-28 13:50:36.400860 |
Duration | 13.32 seconds |
Software version | pandas-profiling v2.11.0 |
Download configuration | config.yaml |
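As a quick consistency check on the overview tables, the sketch below recomputes the reported duration from the two timestamps and the average record size from the total memory size. The variable names are illustrative only, and the 17.1 KiB figure is already rounded, so the record size can only be expected to match at the displayed precision.

```python
from datetime import datetime

# Timestamps copied from the Reproduction table.
started = datetime.fromisoformat("2021-05-28 13:50:23.083630")
finished = datetime.fromisoformat("2021-05-28 13:50:36.400860")
duration_s = (finished - started).total_seconds()

# Sizes copied from the Dataset statistics table (1 KiB = 1024 bytes).
total_bytes = 17.1 * 1024
n_observations = 362
avg_record_bytes = total_bytes / n_observations

print(f"Duration: {duration_s:.2f} seconds")
print(f"Average record size: {avg_record_bytes:.1f} B")
```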
[3,0]-[3,1]
Real number (ℝ≥0)
Distinct | 333 |
---|---|
Distinct (%) | 92.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 9.914286959 |
---|---|
Minimum | 9.898655 |
Maximum | 9.958421 |
Zeros | 0 |
Zeros (%) | 0.0% |
Memory size | 3.0 KiB |
Quantile statistics
Minimum | 9.898655 |
---|---|
5-th percentile | 9.90359895 |
Q1 | 9.908607 |
median | 9.9133915 |
Q3 | 9.91721 |
95-th percentile | 9.9319833 |
Maximum | 9.958421 |
Range | 0.059766 |
Interquartile range (IQR) | 0.008603 |
Descriptive statistics
Standard deviation | 0.008505248962 |
---|---|
Coefficient of variation (CV) | 0.0008578780297 |
Kurtosis | 3.646080581 |
Mean | 9.914286959 |
Median Absolute Deviation (MAD) | 0.0043255 |
Skewness | 1.478415386 |
Sum | 3588.971879 |
Variance | 7.233925991 × 10⁻⁵ |
Monotonicity | Not monotonic |
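Several entries in the quantile and descriptive tables are simple functions of other entries. The following sketch, using values copied from the tables for [3,0]-[3,1], recomputes the range, IQR, coefficient of variation and variance. It assumes the usual definitions (CV as standard deviation over mean, variance as the squared standard deviation); the variance comes out on the order of 10⁻⁵.

```python
# Values copied from the tables for variable [3,0]-[3,1].
minimum, maximum = 9.898655, 9.958421
q1, q3 = 9.908607, 9.91721
mean = 9.914286959
std = 0.008505248962

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range (IQR)
cv = std / mean                  # Coefficient of variation (CV)
variance = std ** 2              # Variance

print(f"Range    = {value_range:.6f}")
print(f"IQR      = {iqr:.6f}")
print(f"CV       = {cv:.10f}")
print(f"Variance = {variance:.6e}")
```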
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%) |
9.920238 | 3 | 0.8% |
9.912914 | 2 | 0.6% |
9.91721 | 2 | 0.6% |
9.920589 | 2 | 0.6% |
9.919925 | 2 | 0.6% |
9.913948 | 2 | 0.6% |
9.914144 | 2 | 0.6% |
9.915686 | 2 | 0.6% |
9.91428 | 2 | 0.6% |
9.909066 | 2 | 0.6% |
Other values (323) | 341 | 94.2% |
Value | Count | Frequency (%) |
9.898655 | 1 | 0.3% |
9.899163 | 1 | 0.3% |
9.899769 | 1 | 0.3% |
9.901156 | 1 | 0.3% |
9.901605 | 1 | 0.3% |
Value | Count | Frequency (%) |
9.958421 | 1 | 0.3% |
9.952874 | 1 | 0.3% |
9.943948 | 1 | 0.3% |
9.941781 | 1 | 0.3% |
9.940745 | 1 | 0.3% |
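The Frequency (%) column in these value tables appears to be the count divided by the number of observations (362, with no missing cells). A minimal sketch under that assumption follows; `frequency_pct` is a hypothetical helper, not part of the profiling tool.

```python
n_observations = 362  # from the Dataset statistics table


def frequency_pct(count: int, total: int = n_observations) -> str:
    """Format a value count as a percentage of all observations."""
    return f"{count / total:.1%}"


# Counts taken from the value tables for [3,0]-[3,1].
for count in (3, 2, 1, 341):
    print(count, frequency_pct(count))
```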
[3,2]-[3,3]
Real number (ℝ≥0)
Distinct | 346 |
---|---|
Distinct (%) | 95.6% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 9.927869425 |
---|---|
Minimum | 9.907732 |
Maximum | 10.022231 |
Zeros | 0 |
Zeros (%) | 0.0% |
Memory size | 3.0 KiB |
Quantile statistics
Minimum | 9.907732 |
---|---|
5-th percentile | 9.9169958 |
Q1 | 9.92261725 |
median | 9.926241 |
Q3 | 9.92997725 |
95-th percentile | 9.94372665 |
Maximum | 10.022231 |
Range | 0.114499 |
Interquartile range (IQR) | 0.00736 |
Descriptive statistics
Standard deviation | 0.01032907148 |
---|---|
Coefficient of variation (CV) | 0.001040411698 |
Kurtosis | 24.20819171 |
Mean | 9.927869425 |
Median Absolute Deviation (MAD) | 0.003722 |
Skewness | 3.656828939 |
Sum | 3593.888732 |
Variance | 0.0001066897177 |
Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%) |
9.923185 | 3 | 0.8% |
9.929455 | 2 | 0.6% |
9.932345 | 2 | 0.6% |
9.922461 | 2 | 0.6% |
9.927834 | 2 | 0.6% |
9.926154 | 2 | 0.6% |
9.924533 | 2 | 0.6% |
9.926291 | 2 | 0.6% |
9.942385 | 2 | 0.6% |
9.929337 | 2 | 0.6% |
Other values (336) | 341 | 94.2% |
Value | Count | Frequency (%) |
9.907732 | 1 | 0.3% |
9.908005 | 1 | 0.3% |
9.910857 | 1 | 0.3% |
9.910896 | 1 | 0.3% |
9.911853 | 1 | 0.3% |
Value | Count | Frequency (%) |
10.022231 | 1 | 0.3% |
9.984808 | 1 | 0.3% |
9.975472 | 1 | 0.3% |
9.975159 | 1 | 0.3% |
9.971684 | 1 | 0.3% |
T_Machine
Real number (ℝ≥0)
Distinct | 31 |
---|---|
Distinct (%) | 8.6% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 20.84889503 |
---|---|
Minimum | 19.1 |
Maximum | 22.8 |
Zeros | 0 |
Zeros (%) | 0.0% |
Memory size | 3.0 KiB |
Quantile statistics
Minimum | 19.1 |
---|---|
5-th percentile | 20 |
Q1 | 20.4 |
median | 20.7 |
Q3 | 21.1 |
95-th percentile | 22 |
Maximum | 22.8 |
Range | 3.7 |
Interquartile range (IQR) | 0.7 |
Descriptive statistics
Standard deviation | 0.6394348223 |
---|---|
Coefficient of variation (CV) | 0.0306699622 |
Kurtosis | 0.6252118202 |
Mean | 20.84889503 |
Median Absolute Deviation (MAD) | 0.3 |
Skewness | 0.6679942001 |
Sum | 7547.3 |
Variance | 0.408876892 |
Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=31)
Value | Count | Frequency (%) |
20.7 | 51 | 14.1% |
20.3 | 34 | 9.4% |
20.9 | 30 | 8.3% |
21 | 26 | 7.2% |
20.5 | 24 | 6.6% |
20.8 | 21 | 5.8% |
20.6 | 15 | 4.1% |
20.1 | 15 | 4.1% |
20.4 | 15 | 4.1% |
21.1 | 14 | 3.9% |
Other values (21) | 117 | 32.3% |
Value | Count | Frequency (%) |
19.1 | 2 | 0.6% |
19.6 | 5 | 1.4% |
19.7 | 2 | 0.6% |
19.9 | 8 | 2.2% |
20 | 5 | 1.4% |
Value | Count | Frequency (%) |
22.8 | 2 | 0.6% |
22.7 | 2 | 0.6% |
22.6 | 4 | 1.1% |
22.3 | 3 | 0.8% |
22.2 | 6 | 1.7% |
T_Ambient
Real number (ℝ≥0)
Distinct | 32 |
---|---|
Distinct (%) | 8.8% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 21.74309392 |
---|---|
Minimum | 20.6 |
Maximum | 24.2 |
Zeros | 0 |
Zeros (%) | 0.0% |
Memory size | 3.0 KiB |
Quantile statistics
Minimum | 20.6 |
---|---|
5-th percentile | 21 |
Q1 | 21.4 |
median | 21.7 |
Q3 | 22 |
95-th percentile | 22.9 |
Maximum | 24.2 |
Range | 3.6 |
Interquartile range (IQR) | 0.6 |
Descriptive statistics
Standard deviation | 0.589538847 |
---|---|
Coefficient of variation (CV) | 0.02711384355 |
Kurtosis | 1.581465254 |
Mean | 21.74309392 |
Median Absolute Deviation (MAD) | 0.3 |
Skewness | 0.972089221 |
Sum | 7871 |
Variance | 0.3475560521 |
Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=32)
Value | Count | Frequency (%) |
21.8 | 46 | 12.7% |
21.6 | 43 | 11.9% |
22 | 26 | 7.2% |
21.2 | 25 | 6.9% |
21.4 | 25 | 6.9% |
21 | 24 | 6.6% |
21.7 | 23 | 6.4% |
21.9 | 21 | 5.8% |
22.2 | 16 | 4.4% |
21.3 | 15 | 4.1% |
Other values (22) | 98 | 27.1% |
Value | Count | Frequency (%) |
20.6 | 2 | 0.6% |
20.7 | 5 | 1.4% |
20.8 | 3 | 0.8% |
20.9 | 7 | 1.9% |
21 | 24 | 6.6% |
Value | Count | Frequency (%) |
24.2 | 1 | 0.3% |
23.8 | 1 | 0.3% |
23.7 | 2 | 0.6% |
23.6 | 1 | 0.3% |
23.5 | 2 | 0.6% |
T_KJ_LT
Real number (ℝ≥0)
Distinct | 36 |
---|---|
Distinct (%) | 9.9% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 19.93756906 |
---|---|
Minimum | 18.5 |
Maximum | 22.4 |
Zeros | 0 |
Zeros (%) | 0.0% |
Memory size | 3.0 KiB |
Quantile statistics
Minimum | 18.5 |
---|---|
5-th percentile | 18.9 |
Q1 | 19.5 |
median | 19.9 |
Q3 | 20.4 |
95-th percentile | 21.2 |
Maximum | 22.4 |
Range | 3.9 |
Interquartile range (IQR) | 0.9 |
Descriptive statistics
Standard deviation | 0.7174691221 |
---|---|
Coefficient of variation (CV) | 0.03598578743 |
Kurtosis | 0.1293453863 |
Mean | 19.93756906 |
Median Absolute Deviation (MAD) | 0.5 |
Skewness | 0.5312054726 |
Sum | 7217.4 |
Variance | 0.5147619412 |
Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=36)
Value | Count | Frequency (%) |
19.7 | 36 | 9.9% |
19.5 | 29 | 8.0% |
19.9 | 26 | 7.2% |
19.3 | 24 | 6.6% |
20.2 | 22 | 6.1% |
20.4 | 20 | 5.5% |
19.1 | 18 | 5.0% |
20.8 | 17 | 4.7% |
20.3 | 15 | 4.1% |
20.1 | 14 | 3.9% |
Other values (26) | 141 | 39.0% |
Value | Count | Frequency (%) |
18.5 | 1 | 0.3% |
18.6 | 5 | 1.4% |
18.7 | 7 | 1.9% |
18.8 | 2 | 0.6% |
18.9 | 9 | 2.5% |
Value | Count | Frequency (%) |
22.4 | 1 | 0.3% |
22 | 2 | 0.6% |
21.9 | 1 | 0.3% |
21.8 | 1 | 0.3% |
21.6 | 6 | 1.7% |
T_KJ_HT
Real number (ℝ≥0)
Distinct | 34 |
---|---|
Distinct (%) | 9.4% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 16.89834254 |
---|---|
Minimum | 15.6 |
Maximum | 19.4 |
Zeros | 0 |
Zeros (%) | 0.0% |
Memory size | 3.0 KiB |
Quantile statistics
Minimum | 15.6 |
---|---|
5-th percentile | 16 |
Q1 | 16.4 |
median | 16.7 |
Q3 | 17.3 |
95-th percentile | 18.295 |
Maximum | 19.4 |
Range | 3.8 |
Interquartile range (IQR) | 0.9 |
Descriptive statistics
Standard deviation | 0.7050255011 |
---|---|
Coefficient of variation (CV) | 0.04172157709 |
Kurtosis | 0.6175425358 |
Mean | 16.89834254 |
Median Absolute Deviation (MAD) | 0.4 |
Skewness | 0.8469289407 |
Sum | 6117.2 |
Variance | 0.4970609571 |
Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=34)
Value | Count | Frequency (%) |
16.6 | 35 | 9.7% |
16.7 | 30 | 8.3% |
16.2 | 26 | 7.2% |
16.4 | 26 | 7.2% |
16.9 | 25 | 6.9% |
17.1 | 21 | 5.8% |
17.3 | 20 | 5.5% |
17 | 16 | 4.4% |
17.4 | 15 | 4.1% |
16.3 | 15 | 4.1% |
Other values (24) | 133 | 36.7% |
Value | Count | Frequency (%) |
15.6 | 3 | 0.8% |
15.7 | 2 | 0.6% |
15.8 | 1 | 0.3% |
15.9 | 10 | 2.8% |
16 | 15 | 4.1% |
Value | Count | Frequency (%) |
19.4 | 1 | 0.3% |
19.1 | 1 | 0.3% |
19 | 2 | 0.6% |
18.9 | 3 | 0.8% |
18.8 | 2 | 0.6% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear relationship the steepness of the line (its angle to the x-axis) does not affect the magnitude of r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
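The definition above can be sketched in plain Python as follows. This is an illustrative implementation, not the code the report itself runs; `pearson_r` is a hypothetical helper, and the sample values are the first five T_Machine and T_Ambient readings from the first-rows table.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r: covariance of X and Y over the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # The 1/n (or 1/(n-1)) factors cancel, so the raw sums are enough.
    return cov / math.sqrt(var_x * var_y)


# First five T_Machine and T_Ambient readings from the first-rows table.
t_machine = [20.8, 20.9, 20.9, 21.0, 21.0]
t_ambient = [21.9, 22.0, 21.9, 22.0, 22.0]

r = pearson_r(t_machine, t_ambient)
# Shifting and (positively) rescaling one variable leaves r unchanged.
r_rescaled = pearson_r([10 * x + 3 for x in t_machine], t_ambient)
print(round(r, 3), round(r_rescaled, 3))
```

Five rows are far too few to say anything about the full dataset; the point is only to show the calculation and the invariance under location and scale changes.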
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
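In other words, ρ is Pearson's r applied to ranks. The sketch below illustrates this with hypothetical helpers (`average_ranks`, `pearson_r`, `spearman_rho`); giving tied values the mean of their rank positions is an assumption of this sketch, chosen because it is the common convention.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of equal values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance over the product of standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(xs, ys):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# A monotonic but nonlinear relationship (made-up data).
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]
print(spearman_rho(xs, ys), pearson_r(xs, ys))
```

For this made-up cubic relationship the ranks agree perfectly, so ρ reaches 1 while Pearson's r stays below 1.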
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is then the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
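The pair-counting definition above corresponds to the variant usually called tau-a. A minimal sketch follows; `kendall_tau_a` is a hypothetical helper, and tie-adjusted variants such as tau-b, which statistical libraries often use by default, are not implemented here.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1  # both variables move in the same direction
        elif product < 0:
            discordant += 1  # the variables move in opposite directions
        # Tied pairs (product == 0) count toward neither total.
    n = len(xs)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Made-up data: one swapped pair among three observations.
print(kendall_tau_a([1, 2, 3], [1, 3, 2]))
```

The double loop over pairs is quadratic in the number of observations, which is fine for a dataset of a few hundred rows.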
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available.
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion visually.
First rows
Index | [3,0]-[3,1] | [3,2]-[3,3] | T_Machine | T_Ambient | T_KJ_LT | T_KJ_HT |
---|---|---|---|---|---|---|
0 | 9.911077 | 9.926153 | 20.8 | 21.9 | 19.6 | 16.9 |
1 | 9.909690 | 9.930274 | 20.9 | 22.0 | 19.7 | 16.6 |
2 | 9.906761 | 9.922285 | 20.9 | 21.9 | 20.3 | 17.6 |
3 | 9.903597 | 9.933068 | 21.0 | 22.0 | 19.9 | 16.6 |
4 | 9.903773 | 9.920216 | 21.0 | 22.0 | 19.9 | 16.6 |
5 | 9.904124 | 9.918399 | 21.0 | 22.0 | 19.9 | 17.1 |
6 | 9.908578 | 9.922735 | 20.9 | 21.8 | 19.7 | 16.4 |
7 | 9.912581 | 9.929337 | 20.9 | 21.8 | 20.3 | 17.3 |
8 | 9.916663 | 9.937149 | 20.8 | 21.6 | 20.3 | 16.7 |
9 | 9.905785 | 9.920762 | 20.8 | 21.6 | 20.3 | 16.6 |
Last rows
Index | [3,0]-[3,1] | [3,2]-[3,3] | T_Machine | T_Ambient | T_KJ_LT | T_KJ_HT |
---|---|---|---|---|---|---|
352 | 9.917836 | 9.917068 | 21.2 | 21.8 | 19.9 | 17.0 |
353 | 9.912288 | 9.920448 | 21.2 | 21.9 | 19.9 | 16.7 |
354 | 9.904105 | 9.917224 | 21.2 | 22.0 | 20.2 | 17.1 |
355 | 9.902522 | 9.914998 | 21.3 | 22.2 | 20.6 | 17.1 |
356 | 9.904671 | 9.944373 | 21.4 | 22.2 | 20.6 | 17.2 |
357 | 9.901156 | 9.920994 | 21.4 | 22.2 | 20.6 | 17.2 |
358 | 9.906801 | 9.913201 | 21.4 | 22.2 | 20.1 | 17.0 |
359 | 9.902932 | 9.911853 | 21.3 | 22.2 | 20.1 | 17.0 |
360 | 9.914182 | 9.931991 | 20.9 | 22.1 | 20.8 | 18.0 |
361 | 9.904944 | 9.920311 | 20.9 | 22.1 | 20.8 | 18.0 |