Using BitSet to realize millisecond query (example explanation)
preface
Because the service requires that the response time of one request of API is less than 10ms, the traditional database query operations are directly excluded (network IO and disk IO). Through investigation, bieset is finally used, which has been running normally for a long time
BitSet introduction
I'm confused by the explanation in JDK. I'll summarize it with my own understanding
1. The internal implementation of BitSet is long array 2 The default value of each bit in set is false (0). 3. BitSet length increases as needed. 4. BitSet is not thread safe
Analysis of key methods of BitSet
Set the specified bit to true and the bitindex parameter to a non negative integer. Suppose we execute the following code and observe the changes of wordindex and words [wordindex] values in the above code
From the above table, we can clearly draw a conclusion according to the corresponding relationship between bitindex and words [wordindex] binary values, that is, each long in BitSet can indicate whether 64 non negative integers exist in BitSet. For example, 0001 can indicate that the integer 0 exists in BitSet, and 1111 can indicate that the integer 3,2,1,0 exists in BitSet.
Enter the topic and realize BitSet millisecond query
Imagine a scenario where we have a user table
Suppose we want to query "18-year-old girls in Beijing", the corresponding SQL is as follows:
How to use BitSet to implement the same query?
1. Load the user table data into memory
2. Create BitSet indexes of address, age and gender dimensions for the user table
3. Query data according to index
1. Load the user table data into memory
Read the user table from the database and store it in the list
User entity class:
2. Indexing
Create BitSet index model class
Create address, gender and age dimension indexes for each user object
3. Test BitSet
Test results (query 2W times):
Data volume (users. Size()) | concurrent number | average query time
---|---|--- 20w | 10 | 1ms 50w | 20 | 3ms 100w| 50 | 9ms
The tester is ThinkPad x240 i5 8g memory
4. Summary
==Advantages = =:
Through the test, it is found that with the increase of the amount of data and the number of concurrency, the average time-consuming does not increase significantly, and the response time is less than 10ms
==Disadvantages = =:
1. It is not suitable for the case of too large amount of data, because it is necessary to load all the data into the memory
2. Not suitable for complex query
3. It is not suitable for querying unique values such as name and ID
Postscript
Because our query business is relatively simple, the only requirement is speed, and the amount of data is small. The amount of data in each table does not exceed 100W, so this method is more appropriate.
In this article, I only talk about how to create indexes and the most basic queries. In the next article, I will continue to explain how to update indexes and some complex queries, such as <, >, between and.
The above article using BitSet to realize millisecond query (example explanation) is all the content shared by Xiaobian. I hope it can give you a reference and support programming tips.