Java – How to identify whether a string contains special characters that cannot be stored using the utf8mb4 character set
Please refer to this tweet and the thread below. We are trying to store tweets like it in a database, but I can't store this one in MySQL. I want to know how to detect this case: if a string contains a character that the utf8mb4 character set can't handle, I can skip storing it.
Solution
The character causing the problem for you is U+1F603 SMILING FACE WITH OPEN MOUTH, whose value cannot be represented in 16 bits. When converted to UTF-8, its byte values are F0 9F 98 83, which should fit without issue in a MySQL column that uses the utf8mb4 character set, so I agree with the other commenters that this is not a MySQL problem. If you can try re-inserting this tweet, please log all SQL statements as received by MySQL to determine whether the characters are being corrupted before or after they are sent to MySQL.
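
To check the byte values above yourself and to find which strings contain such characters, something like the following may help. It is only a sketch: the class name Utf8mb4Check and its helper methods are made up for illustration and are not part of any library. It flags code points above U+FFFF (the ones that need four bytes in UTF-8) and unpaired surrogates, which are not valid Unicode text and are a likely sign that the string was damaged before it ever reached the database.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class Utf8mb4Check {

    /** True if the string has at least one code point above U+FFFF (four bytes in UTF-8). */
    public static boolean hasSupplementaryCodePoint(String s) {
        return s.codePoints().anyMatch(Character::isSupplementaryCodePoint);
    }

    /**
     * True if the string has a surrogate code unit without its partner.
     * Such a string has no valid UTF-8 form; String.getBytes substitutes
     * a replacement byte ('?') for it.
     */
    public static boolean hasUnpairedSurrogate(String s) {
        // codePoints() merges valid surrogate pairs, so any value still in
        // the surrogate range here must be unpaired.
        return s.codePoints().anyMatch(
                cp -> cp >= Character.MIN_SURROGATE && cp <= Character.MAX_SURROGATE);
    }

    /** The UTF-8 bytes of the string as space-separated upper-case hex. */
    public static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String smiley = new String(Character.toChars(0x1F603));
        String broken = String.valueOf((char) 0xD83D); // high surrogate only

        System.out.println(utf8Hex(smiley));                   // F0 9F 98 83
        System.out.println(hasSupplementaryCodePoint(smiley)); // true
        System.out.println(hasUnpairedSurrogate(smiley));      // false
        System.out.println(hasUnpairedSurrogate(broken));      // true
    }
}
```

A string for which hasSupplementaryCodePoint returns true is still fine for a utf8mb4 column; the check only tells you which tweets to look at when comparing what your program sends with what MySQL logs.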