How to Find and Resolve Illegal NaN Values in a Table
Summary
The customer confirms that they are seeing inconsistencies between TiKV and TiFlash with version 4.0.1 through 4.9. An internal investigation reveals that the issue is that TiKV and TiFlash have different behavior when an illegal NaN
value is inserted into a table.
Diagnosis
In summary, this is how we reproduced a NaN insertion (please refer to the attached files col1.sql, pk.sql, and ins.go).
#install TiDB version 4.0
$ tiup playground v4.0.0
#inside mysql shell, create a simple test table
create table t ( c double not null);
#then run the code to do a NaN insertion
$ go run ins.go
# and test the result
select * from t;
+-----+
| c |
+-----+
| NaN |
+-----+
You have a new table named t
with a single value of NaN
stored inside of it.
Solution
If we do not know where in the database the NaN insertion has occurred, We need to use the following query to find all the numeric columns in the database.
$ cat col1.sql
select concat("select * from ",
col.table_schema,".", col.table_name, " t1 inner join ",
col.table_schema,".", col.table_name, " t2 on t1.",
col.column_name, " != t2.",
col.column_name, " AND ",
"t1.@pk = t2.@pk;") as stmt
from information_schema.columns col
join information_schema.tables tab on tab.table_schema = col.table_schema
and tab.table_name = col.table_name
and tab.table_type = 'BASE TABLE'
where col.data_type in ('tinyint', 'smallint', 'mediumint',
'int', 'bigint', 'decimal', 'bit',
'float', 'double')
and col.table_schema not in ('information_schema', 'sys',
'performance_schema', 'mysql')
and col.table_schema = 'test' -- put your database name here
order by col.table_schema,
col.table_name,
col.ordinal_position;
#invoke it from mysqld as follows
source col1.sql;
+--------------------------------------------------------------+
| stmt |
+--------------------------------------------------------------+
| select * from test.t t1 inner join test.t t2 on t1.c != t2.c |
+--------------------------------------------------------------+
1 row in set (0.02 sec)
# next, we use the fact that NaN != NaN to find rows that contain NaN values
select * from test.t t1 inner join test.t t2 on t1.c != t2.c;
+-----+-----+
| c | c |
+-----+-----+
| NaN | NaN |
+-----+-----+
1 row in set (0.00 sec)
# We will also need to find the primary keys for each table whose column was selected in
# the above query. Refer to the attached file pk.sql for the query.
select C.TABLE_NAME, C.COLUMN_NAME FROM
INFORMATION_SCHEMA.TABLE_CONSTRAINTS T
join INFORMATION_SCHEMA.columns C
WHERE
T.TABLE_SCHEMA='test' AND
T.table_name=C.table_name AND
T.CONSTRAINT_TYPE='PRIMARY KEY' AND
C.column_key='PRI';
# Finally, we put it all together in order to find the rows that contain NaN values
# run on all the same database table comparison statement on out, each table will
# have a separate SQL statement file. In our testing, we used a table called record_id
# and another called users_user_id. you end up with two SQL files. as below:
record_id.sql
users_user_id.sql
# The overall procedure is straightforward. First, run pk.sql to get the primarykey column
# names of all the tables in the information schema, then run your col.sql to make the
# statements for each table, then use bash to replace @pk.sql according to the table:
#!/bin/bash
# get all the columns in the information schema that are of type numeric
mysql --comments --host 127.0.0.1 --port 4000 -u root <col.sql >stmt.txt
# get all the corresponding primary keys for each table
mysql --comments --host 127.0.0.1 --port 4000 -u root <pk.sql >primaryKey.txt
# remove the headers in each result file
tail -n +2 "stmt.txt" >stmt.txt.bak && mv stmt.txt.bak stmt.txt
tail -n +2 "primaryKey.txt" >primaryKey.txt.bak && mv primaryKey.txt.bak primaryKey.txt
#replace the @pk.sql with the primary key statement
while read line
do
echo $line
tn=$(echo $line | awk '{print $1}')
pk=$(echo $line | awk '{print $2}')
echo "$tn,$pk"
grep "${tn}" stmt.txt |sed "s/@pk/$pk/g" > ${tn}_$pk.sql
done < primaryKey.txt
The results for our simple tables will look like this:
% cat users_user_id.sql
select * from test.users t1 inner join test.users t2 on t1.user_id != t2.user_id AND t1.user_id = t2.user_id
% cat record_id.sql
select * from test.record t1 inner join test.record t2 on t1.id != t2.id AND t1.id = t2.id
select * from test.record t1 inner join test.record t2 on t1.vc_trip_id != t2.vc_trip_id AND t1.id = t2.id
select * from test.record t1 inner join test.record t2 on t1.date != t2.date AND t1.id = t2.id
select * from test.record t1 inner join test.record t2 on t1.record_url_posted_to_vo != t2.record_url_posted_to_vo AND t1.id = t2.id
select * from test.record t1 inner join test.record t2 on t1.total_auto_distance != t2.total_auto_distance AND t1.id = t2.id
select * from test.record t1 inner join test.record t2 on t1.total_distance != t2.total_distance AND t1.id = t2.id
select * from test.record t1 inner join test.record t2 on t1.total_auto_time != t2.total_auto_time AND t1.id = t2.id
select * from test.record t1 inner join test.record t2 on t1.total_data_size != t2.total_data_size AND t1.id = t2.id
select * from test.record t1 inner join test.record t2 on t1.total_time != t2.total_time AND t1.id = t2.id
select * from test.record t1 inner join test.record t2 on t1.disengagement_num != t2.disengagement_num AND t1.id = t2.id
select * from test.record t1 inner join test.record t2 on t1.intervention_num != t2.intervention_num AND t1.id = t2.id
select * from test.record t1 inner join test.record t2 on t1.hard_brake_num != t2.hard_brake_num AND t1.id = t2.id
select * from test.record t1 inner join test.record t2 on t1.uncomfortable_brake_num != t2.uncomfortable_brake_num AND t1.id = t2.id
select * from test.record t1 inner join test.record t2 on t1.system_crash != t2.system_crash AND t1.id = t2.id
select * from test.record t1 inner join test.record t2 on t1.vehicle_mass != t2.vehicle_mass AND t1.id = t2.id
select * from test.record t1 inner join test.record t2 on t1.release_candidate != t2.release_candidate AND t1.id = t2.id
select * from test.record t1 inner join test.record t2 on t1.auto_crash_num != t2.auto_crash_num AND t1.id = t2.id
select * from test.record t1 inner join test.record t2 on t1.has_changed_gflags != t2.has_changed_gflags AND t1.id = t2.id
Armed with the information listed above, which will naturally look different for your database, you can write a custom MySQL script to UPDATE the rows that contain NaN values and change them to a '0'.
note that there might be tables with the primary key column as the only numerical column, that is fine, the query statement will not return anything in this case.
Please take this script as a reference and note that it’s only for demonstration purposes, feel free to modify it in any way you like: e.g., you may want to merge all statements into one file instead of many.
Applies To
All TiDB releases
Tags
#TiKV #NaN #data_inconsistency #TiFlash
Contributed by: The PingCAP Support Team
Code listings
ins.go
package main
import (
"database/sql"
_ "github.com/go-sql-driver/mysql"
"log"
"math"
)
func main() {
db, err := sql.Open("mysql",
"root:@tcp(127.0.0.1:4000)/test")
if err != nil {
log.Fatal(err)
}
defer db.Close()
prepareStmt, err := db.Prepare("insert into t values(?)")
if err != nil {
log.Fatal(err)
}
var fo uint64 = 0xFFF0000000000001
f := math.Float64frombits(fo)
_, err = prepareStmt.Exec(f)
if err != nil {
log.Fatal(err)
}
}
col1.sql
use test;
select concat("select * from ",
col.table_schema,".", col.table_name, " t1 inner join ",
col.table_schema,".", col.table_name, " t2 on t1.",
col.column_name, " != t2.",
col.column_name, " AND ",
"t1.@pk = t2.@pk;") as stmt
from information_schema.columns col
join information_schema.tables tab on tab.table_schema = col.table_schema
and tab.table_name = col.table_name
and tab.table_type = 'BASE TABLE'
where col.data_type in ('tinyint', 'smallint', 'mediumint',
'int', 'bigint', 'decimal', 'bit',
'float', 'double')
and col.table_schema not in ('information_schema', 'sys',
'performance_schema', 'mysql')
and col.table_schema = 'test' -- put your database name here
order by col.table_schema,
col.table_name,
col.ordinal_position;
pk.sql
select C.TABLE_NAME, C.COLUMN_NAME FROM
INFORMATION_SCHEMA.TABLE_CONSTRAINTS T
join INFORMATION_SCHEMA.columns C
WHERE
T.TABLE_SCHEMA='test' AND
T.table_name=C.table_name AND
T.CONSTRAINT_TYPE='PRIMARY KEY' AND
C.column_key='PRI';
End of code listings
Comments
0 comments
Please sign in to leave a comment.