Distributed cache scheme
1、 Start with data
We need to classify the data before caching
By frequency of change:
By frequency of use:
This data is more suitable for caching.
Caching is everywhere. Memory can be regarded as a cache between CPU and disk. The processing speeds of CPU and memory are also inconsistent, so L1 & L2 caches appear
The essence of cache: the processing speed at all levels of the system does not match, and space is used for time.
Cache load time
1. Full load at startup
2. Lazy loading
2.1. Synchronous use loading
2.2. Delayed asynchronous loading
Get data from the cache and return it directly whether there is one or not.
Strategy 1: if the cache is empty, initiate an asynchronous thread to load.
Strategy 2: asynchronous threads are responsible for maintaining cached data and triggering updates regularly or according to conditions.
Cache expiration policy
2、 Local cache
public static final Map<String,Object> CACHE=new HashMap();
CACHE.put("key","value");
Cache<String,String> cache = CacheBuilder.newBuilder() .maximumSize(1024) .expireAfterWrite(60,TimeUnit.SECONDS) .weakValues() .build();
cache.put("word","Hello Guava Cache");
System.out.println(cache.getIfPresent("word"));
Core functions: @ cacheable, @ cacheput, @ cacheevict
Disadvantages of local caching:
3、 Remote cache
Redis is an open-source key value database written in ANSI C language, which can also be persisted based on memory, and provides APIs in multiple languages
Memcached is a distributed cache system developed by Brad Fitzpatrick of livejournal, but it is used by many websites. This is a set of open source software licensed by BSD license.
4、 Memory grid
5、 Cache FAQs
1. Cache penetration
Problem Description: a large number of concurrent queries do not exist, resulting in the direct transmission of pressure to the database.
Analysis: because there is no value in the database, the cache is not established, resulting in hitting the database all the time.
terms of settlement:
Bloom filter example: (Introducing guava dependency)
public static void main(String[] args) {
BloomFilter<CharSequence> filter = BloomFilter.create(
Funnels.stringFunnel(Charsets.UTF_8),//Funnels.integerFunnel(),//数据格式
1000000,//预计存入数据量
0.01);//误判率
System.out.println(filter.mightContain("abcdefg"));
filter.put("abcdefg");
System.out.println(filter.mightContain("abcdefg"));
}
Roaringbitmap example: introducing dependencies:
<dependency>
<groupId>org.roaringbitmap</groupId>
<artifactId>RoaringBitmap</artifactId>
<version>0.8.1</version>
</dependency>
public static void test3(){
Roaring64NavigableMap roaring64NavigableMap = Roaring64NavigableMap.bitmapOf(3,4,5,90);
//是否包含
boolean contains = roaring64NavigableMap.contains(3);
long l = roaring64NavigableMap.rankLong(3);
System.out.println(l);
System.out.println(contains);
}
2. Buffer breakdown
Problem: when a key fails, a large number of concurrent requests access the key
Analysis: similar to cache penetration, this is accidental
terms of settlement:
3. Cache avalanche
Problem: when a large-scale cache failure occurs at a certain time, a large number of requests will hit the database, resulting in excessive pressure and downtime of the database
Analysis: Generally speaking, due to the update strategy, data hotspot, cache service downtime and other reasons, the cached data cannot be stored on a large scale at the same time.
terms of settlement:
4. Update the database or delete the cache first?
4.1 delete the cache first, and then update the database
4.2 update the database before deleting the cache
To sum up, although there are inconsistencies, the second one is better handled. That is, select to update the database first and then delete the cache. The model is as follows:
If there is an exception when deleting the cache, you can put the task of deleting the cache into the message middleware and ask it to retry the deletion.
Fan Wai:
Bloom filter:
The goal is to compare and filter the original metadata stored and generated by the filter. If it is in the original metadata collection, it will be found. It's also possible that the one inside was not killed by mistake.
Bloomfilter will open up an m-bit bitarray (bit group). At first, all data will be deployed as 0. When an element comes, different values will be calculated through multiple hash functions, and then the corresponding subscript will be found according to the hash value, and the value inside will be changed to 1
Advantages: use computing to save storage space.
Disadvantages: error rate. Data not in the original table of the filter will also be miscalculated.
Usage scenario: the goal is to compare and filter the original metadata stored and generated by the filter. If it is in the original metadata collection, it will be found. The correct use of Bloom filter core is to filter prohibition and correct negation.
For example: if we have 1 million URL addresses in the blacklist, and we calculate that one address is not in it, we can certainly release it.
BitMap:
The basic idea of bitmap is to mark the corresponding value of an element with a bit bit, which can greatly save space.
In Java, an int takes up 4 bytes, that is, 32bit. The size difference between int storage and bit storage is 32 times.
So how do you represent a number? You can use 1 to indicate existence and 0 to indicate non existence.
As follows: indicates {2, 6}
A byte has only 8 positions. What if you want to represent 13? You can only use one byte, and it becomes a two-dimensional array
One int occupies 32 bits, so we only need to apply for an int array with the length of int TMP [1 + n / 32], where n represents the maximum value of these numbers to be stored
Usage scenario:
After you put the number in, go through it, take out all the ones with the value of 1, and then arrange the order.
Find the number of non repeating integers from 2 billion integers?
There is not enough memory to hold these 2 billion integers. How do we represent the state of numbers? The status of a number can be divided into three types: nonexistence, existence once, existence twice or more. This requires two bits to represent. 00 represents nonexistence, 01 represents one time, 11 represents two or more times.
Next, we will put these 2 billion integers in. If the status is 00, it will be changed to 01, and if the status is 01, it will be changed to 11 If the status is 11, it will not move. After all the data are released, traverse the data with the value of 01, that is, the number of non duplicate data.. 3. Quick search
Given an integer m, M / 32 can get the subscript of the int array, and M% 32 knows the specific position in this subscript.
For example, 13, you can calculate the 13th in int [0]