Analysis of the Openfire cluster source code

This article looks at how clustering works in Openfire. Not many people seem to be using Openfire these days, but let's look at the details anyway.

What is Openfire?

Openfire is an open-source real-time collaboration (RTC) server written in Java and based on the XMPP (Jabber) protocol. It is easy to install and use, and can be managed through a web console. A single server can support tens of thousands of concurrent users. Because it uses the open XMPP protocol, any IM client that supports XMPP can log in to the service. If you want an easy way to build an efficient instant messaging server, it is a good choice.

What can Openfire do?

To understand Openfire, you first need to understand the XMPP protocol, because Openfire is an open-source real-time collaboration server written in Java and based on XMPP. Openfire is cross-platform. Openfire and its clients use a client/server architecture, in which a server provides services to the clients connected to it. Openfire clients include Spark, Pidgin, Miranda IM, iChat and others. Users who develop their own client can use Smack, an open-source client library released under the Apache License 2.0. The Openfire server supports plug-in development: developers who need a new service can write their own plug-in and install it into the server. For example, the contact search service is provided as a plug-in.

If the number of Openfire users grows, clustering is needed to solve the throughput problem. Openfire provides cluster support, and two cluster plug-ins have been implemented: hazelcast and clustering. To understand how the cluster works, I analyzed the Openfire source code, which was also a learning process. First, some basic cluster concepts.

The purpose of a cluster is to make multiple instances run like one instance, so that computing power can be increased by adding instances. This is a distributed computing problem, and one of the most discussed topics there is the CAP theorem: consistency, availability and partition tolerance. CAP is the core problem a cluster has to deal with.

My overall understanding of CAP is what I wrote above: multiple instances run like one instance.

Clustering therefore comes down to sharing or synchronizing some data across instances. If every instance runs the same algorithm on the same data, the results should be the same. Master-slave replication in some databases and clustered caches are similar solutions; they differ mainly in implementation quality and in the scale they can handle.

With this foundation, let's take a look at how openfire solves this problem.

Cluster design of Openfire

1. What needs to be synchronized between cluster nodes

For Openfire, several kinds of data need to be synchronized between cluster nodes: data stored in the database, cached data, and sessions. That seems to be all.

Database

Database access in Openfire is basically transparent to clustering, so synchronization of this data is left to the database itself.

Cache data

The cache is stored in memory, so this part needs to be synchronized.

Session

Sessions do not need to be synchronized to all instances in Openfire, but a user routing cache is required; otherwise the instance that holds the corresponding session cannot be found when a message is sent. Therefore, user routes still need to be synchronized.
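
The idea that sessions stay local while only the user route is shared can be sketched roughly as follows. This is only an illustration under my own assumptions, not Openfire code: RoutingSketch, Node, the node IDs and the plain in-memory map standing in for the clustered routing cache are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: each node keeps its own sessions, and only the
// "user -> owning node" route is shared between nodes.
class RoutingSketch {
    static class Node {
        final String nodeId;
        // Stand-in for the clustered routing cache (same map on every node).
        final Map<String, String> sharedRoutes;
        // All nodes by id; assumed to be known to every node.
        final Map<String, Node> peers;
        // Local sessions only: user -> messages delivered to that session.
        final Map<String, List<String>> localSessions = new ConcurrentHashMap<>();

        Node(String nodeId, Map<String, String> sharedRoutes, Map<String, Node> peers) {
            this.nodeId = nodeId;
            this.sharedRoutes = sharedRoutes;
            this.peers = peers;
            peers.put(nodeId, this);
        }

        // A login creates a local session and publishes only the route.
        void login(String user) {
            localSessions.put(user, new ArrayList<>());
            sharedRoutes.put(user, nodeId);
        }

        // Returns true if a session for the recipient was found.
        boolean send(String toUser, String message) {
            String owner = sharedRoutes.get(toUser);
            if (owner == null) {
                return false; // no route: the user is not online anywhere
            }
            if (owner.equals(nodeId)) {
                localSessions.get(toUser).add(message); // deliver locally
                return true;
            }
            return peers.get(owner).send(toUser, message); // forward to the owning node
        }
    }
}
```

With two nodes sharing the same route map, a message sent on one node to a user logged in on the other is forwarded by looking up the route, while the session object itself never leaves its node.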

2. Cache design

Cache interface

Openfire provides a wrapper interface for cache containers. It defines the basic methods for cached data so that data operations are uniform.

public interface Cache<K, V> extends java.util.Map<K, V>

If clustering is not enabled, the default cache container class is: public class DefaultCache<K, V>. In fact, DefaultCache uses a HashMap to store data.
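
To make the relationship concrete, here is a reduced sketch of a cache interface that extends Map together with a HashMap-backed local implementation. SketchCache and LocalSketchCache are names I made up for illustration; the real Cache and DefaultCache have more methods than shown here.

```java
import java.util.AbstractMap;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical, reduced stand-in for a cache interface built on top of Map.
interface SketchCache<K, V> extends Map<K, V> {
    String getName();
    void setName(String name);
}

// Local implementation that keeps its data in a plain HashMap.
class LocalSketchCache<K, V> extends AbstractMap<K, V> implements SketchCache<K, V> {
    private final Map<K, V> map = new HashMap<>();
    private String name;

    LocalSketchCache(String name) {
        this.name = name;
    }

    @Override public String getName() { return name; }
    @Override public void setName(String name) { this.name = name; }

    // Delegate the core Map operations to the backing HashMap;
    // AbstractMap derives size(), clear(), etc. from entrySet().
    @Override public V put(K key, V value) { return map.put(key, value); }
    @Override public V get(Object key) { return map.get(key); }
    @Override public boolean containsKey(Object key) { return map.containsKey(key); }
    @Override public V remove(Object key) { return map.remove(key); }
    @Override public Set<Map.Entry<K, V>> entrySet() { return map.entrySet(); }
}
```

Because callers only see the Map-based interface, the backing store can later be replaced without changing the calling code.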

Cache factory class

To ensure that the cache can be extended, a factory class is provided:

public class CacheFactory

All cache containers are managed in the CacheFactory class, as shown in the following code:

In the above code, a cache container is created by the cache factory strategy object. Finally, the wrapCache method puts this container into caches.

Strategy for the cache factory class

In CacheFactory, a DefaultLocalCacheStrategy is used by default to create caches. A cache strategy for clustered operation can also be plugged in; that is, the cache management scheme is switched by instantiating a different strategy. For example, Hazelcast, which is discussed later, replaces the local cache strategy in this way. From the interface design point of view, Openfire's cache strategy exists to support both clustered and non-clustered implementations.
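
The factory-plus-strategy arrangement described above can be sketched like this. It is a minimal illustration and not the Openfire source: CacheStrategySketch, LocalStrategySketch, SharedStrategySketch and CacheFactorySketch are hypothetical names, and the "clustered" strategy is faked with a map shared inside one JVM.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical strategy contract: decides what kind of container backs a cache.
interface CacheStrategySketch {
    Map<String, Object> createCache(String name);
}

// Non-clustered strategy: every cache is a private in-memory map.
class LocalStrategySketch implements CacheStrategySketch {
    @Override
    public Map<String, Object> createCache(String name) {
        return new HashMap<>();
    }
}

// Stand-in for a clustered strategy: caches with the same name share one
// backing map (the "grid" is assumed to be visible to every node).
class SharedStrategySketch implements CacheStrategySketch {
    private final Map<String, Map<String, Object>> grid;

    SharedStrategySketch(Map<String, Map<String, Object>> grid) {
        this.grid = grid;
    }

    @Override
    public Map<String, Object> createCache(String name) {
        return grid.computeIfAbsent(name, n -> new ConcurrentHashMap<>());
    }
}

// The factory registers every cache by name and delegates creation
// to whichever strategy is currently installed.
class CacheFactorySketch {
    private final Map<String, Map<String, Object>> caches = new ConcurrentHashMap<>();
    private volatile CacheStrategySketch strategy = new LocalStrategySketch();

    Map<String, Object> createCache(String name) {
        return caches.computeIfAbsent(name, n -> strategy.createCache(n));
    }

    void setStrategy(CacheStrategySketch newStrategy) {
        strategy = newStrategy;
    }
}
```

Two factories given the same shared strategy see each other's entries, while factories left on the local strategy do not. The sketch does not migrate caches that already exist when the strategy changes; that step is a separate concern.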

3. Cluster design

Clustering in Openfire mainly covers cluster management, data synchronization management and cluster computing tasks.

Cluster manager

In Openfire, cluster management is mainly implemented by one class, ClusterManager, which handles cluster instances joining and leaving. Because no master-slave structure is used, ClusterManager implements decentralized management, although I am not sure my understanding is correct. As long as clustering is enabled for the current instance, ClusterManager actively loads cluster management and synchronizes with the other cluster nodes.

startup

startup is the method that starts the cluster. Code:

First, it checks that clustering is enabled and that the current cluster instance is not already running.

Next, the event dispatcher is initialized to handle cluster synchronization.

Then it calls CacheFactory.startClustering to run the cluster. The startClustering method mainly does the following:

It uses the cluster's cache factory strategy to start up and add itself to the cluster.

It starts a thread to synchronize the state of the cache.

The initEventDispatcher method called earlier in startup registers a dispatcher thread that listens for cluster events. When an event arrives, it performs the joinedCluster or leftCluster operation. joinedCluster means that the node has joined the cluster.

Local cache containers are converted to cluster caches in joinedCluster. This completes the initialization of the cluster and the node's entry into it.
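
The dispatcher-thread pattern described above might look roughly like the following. This is a sketch under my own assumptions, not the ClusterManager code: ClusterEventSketch, its EventType enum, and the STOP marker used to end the loop are all hypothetical.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch of a dispatcher thread that turns queued cluster
// events into joinedCluster/leftCluster callbacks.
class ClusterEventSketch {
    enum EventType { JOINED, LEFT, STOP }

    interface Listener {
        void joinedCluster();
        void leftCluster();
    }

    private final BlockingQueue<EventType> events = new LinkedBlockingQueue<>();
    private final List<Listener> listeners = new CopyOnWriteArrayList<>();
    private final Thread dispatcher = new Thread(this::dispatchLoop, "cluster-event-dispatcher");

    void addListener(Listener listener) {
        listeners.add(listener);
    }

    void start() {
        dispatcher.setDaemon(true);
        dispatcher.start();
    }

    // Called by the cluster layer; never blocks the caller.
    void fire(EventType type) {
        events.add(type);
    }

    // Ends the loop once all previously queued events have been handled.
    void stopAndWait() {
        events.add(EventType.STOP);
        try {
            dispatcher.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    private void dispatchLoop() {
        try {
            while (true) {
                EventType type = events.take(); // blocks until an event arrives
                if (type == EventType.STOP) {
                    return;
                }
                for (Listener listener : listeners) {
                    if (type == EventType.JOINED) {
                        listener.joinedCluster();
                    } else {
                        listener.leftCluster();
                    }
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

The point of the queue is that whoever detects the membership change only enqueues an event, while the possibly slow work of switching caches happens on the dispatcher thread.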

shutdown

shutdown is relatively simple: it exits the cluster and restores the cache factory to the local cache.

Synchronization management

The above is mainly about how the cluster is managed. More important is how data is synchronized between cluster nodes. This part mainly depends on the specific distributed computing system in use. From Openfire's side, the data is put into the cluster cache, and the cluster component, such as Hazelcast, does the rest.

Because caches are used to solve this problem, CacheFactory contains a lot of cluster handling code, especially for cache strategy switching and cluster task processing, which are exposed as interface methods of CacheFactory. This also makes the cluster implementation transparent.

Cluster computing tasks

So far, computing in the cluster has not been mentioned. Now that there is a cluster, can its advantages be used for some parallel computing? I'm not too sure about this part. I have only seen the relevant code, so I'll list it briefly.

There are several methods in the CacheFactory class: doClusterTask and doSynchronousClusterTask. Both are overloaded with different parameters. These methods are used to perform computing tasks. Take a look at doClusterTask:

One limitation here is that the task must be a class derived from ClusterTask. See its definition:

It is mainly about asynchronous execution and serialization. It is asynchronous because it must not block, and serializable because it has to be transmitted within the cluster.

If you look at the doClusterTask method of CacheFactory, you can see that it just delegates to the doClusterTask of the cache factory strategy. The specific implementation depends on the cluster implementation.
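
A task that is both runnable and serializable, as described above, could be sketched as follows. I am assuming here that the contract combines Runnable with Java's Externalizable and exposes a result getter; TaskSketch, LengthTask and TaskTransport are hypothetical names, and the "wire" is just a byte array in the same JVM.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

// Hypothetical task contract: runnable on any node and transferable as bytes.
interface TaskSketch<V> extends Runnable, Externalizable {
    V getResult();
}

// Example task: computes the length of a string on whichever node runs it.
class LengthTask implements TaskSketch<Integer> {
    private String text;
    private int result;

    // Externalizable requires a public no-argument constructor.
    public LengthTask() { }

    LengthTask(String text) {
        this.text = text;
    }

    @Override public void run() { result = text.length(); }
    @Override public Integer getResult() { return result; }

    // Only the input is sent over the wire; the result is computed remotely.
    @Override public void writeExternal(ObjectOutput out) throws IOException {
        out.writeUTF(text);
    }

    @Override public void readExternal(ObjectInput in) throws IOException {
        text = in.readUTF();
    }
}

class TaskTransport {
    // Simulates sending a task to another node: serialize it, then
    // deserialize an independent copy from the bytes.
    @SuppressWarnings("unchecked")
    static <T extends TaskSketch<?>> T copyOverWire(T task) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(task);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("task could not be transferred", e);
        }
    }
}
```

The receiving side gets its own copy of the task, runs it, and reads the result from that copy, which is why the task has to carry everything it needs in its serialized form.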

A look at the Hazelcast implementation, to briefly understand the Openfire cluster

Openfire has plug-in implementations of clustering. Here, Hazelcast is taken as an example for a simple analysis.

Cache strategy factory class (ClusteredCacheFactory)

ClusteredCacheFactory implements CacheFactoryStrategy. The code is as follows:

First, the startCluster method is used to start the cluster. It mainly does several things:

Set the cache serialization utility class, ClusterExternalizableUtil. This is the serialization tool used for data replication between cluster nodes.

Set the remote session locator, RemoteSessionLocator. Because sessions are not synchronized, it is mainly used to read sessions across instances.

Set up the remote packet router, ClusterPacketRouter, so that messages can be sent within the cluster.

Load the Hazelcast instance, set the node ID, and set the ClusterListener.

When talking about cluster startup, I mentioned cache switching. How is it done?

After the cluster is started, the CacheFactory.joinedCluster method is called to join the cluster. Take a look at the code for joining:

Here you can see that all cache containers are read and handled one by one through their wrappers, and a new cache is created with the same cache name. This step uses the switched-in cluster cache strategy factory; that is, ClusteredCacheFactory is used to create the new cache container. Finally, the cached data is written to the new ClusteredCache to complete the cache switch.
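
The switch described above (create a clustered cache with the same name, copy the local entries, repoint the wrapper) can be sketched as follows. This is my own simplified illustration, not the joinedCluster source: SwappableCache, JoinSketch and the Function used as the clustered cache factory are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical wrapper: callers keep this reference while the backing
// map behind it is swapped from local to clustered.
class SwappableCache {
    private volatile Map<String, String> delegate;

    SwappableCache(Map<String, String> delegate) {
        this.delegate = delegate;
    }

    String get(String key) { return delegate.get(key); }
    void put(String key, String value) { delegate.put(key, value); }
    Map<String, String> getDelegate() { return delegate; }
    void setDelegate(Map<String, String> newDelegate) { delegate = newDelegate; }
}

class JoinSketch {
    private final Map<String, SwappableCache> caches = new HashMap<>();

    // Before joining, every cache is backed by a local HashMap.
    SwappableCache createCache(String name) {
        return caches.computeIfAbsent(name, n -> new SwappableCache(new HashMap<>()));
    }

    // On join: for each registered cache, obtain a clustered map with the
    // same name, copy the local entries into it, then repoint the wrapper.
    void joinedCluster(Function<String, Map<String, String>> clusteredCacheFactory) {
        for (Map.Entry<String, SwappableCache> entry : caches.entrySet()) {
            Map<String, String> clustered = clusteredCacheFactory.apply(entry.getKey());
            clustered.putAll(entry.getValue().getDelegate());
            entry.getValue().setDelegate(clustered);
        }
    }
}
```

Code that obtained a cache before the node joined keeps working afterwards, because it holds the wrapper rather than the backing map.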

Next, let's take a look at the createCache implementation of ClusteredCacheFactory:

ClusteredCache is used here, and most importantly, the second parameter passed in, the map, is now Hazelcast's. When the cache container is accessed later, it is no longer the original local cache but Hazelcast's map object. Hazelcast automatically synchronizes the map data, which completes the cache synchronization.

Cluster computing

Let's look at the Hazelcast implementation. Take doClusterTask in ClusteredCacheFactory as an example.

The process is to first obtain the instance members in the cluster, excluding the local one. Hazelcast then provides an executor service to execute the task; the method is submitToMembers, which submits the task to those members. It's just not clear to me how the computation is allocated and how the results are collected.
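
As far as I know, Hazelcast's IExecutorService.submitToMembers hands back one Future per member, so collecting results comes down to waiting on those futures. The sketch below imitates that shape without Hazelcast: MemberExecutorSketch is a hypothetical class, the "members" are just single-thread executors in one JVM, and the member names are made up.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical in-process stand-in for "submit a task to every other member".
class MemberExecutorSketch {
    private final Map<String, ExecutorService> members = new LinkedHashMap<>();
    private final String localMember;

    MemberExecutorSketch(String localMember, String... allMembers) {
        this.localMember = localMember;
        for (String member : allMembers) {
            // One single-thread executor plays the role of one cluster node.
            members.put(member, Executors.newSingleThreadExecutor());
        }
    }

    // Submits the task to every member except the local one and returns
    // one Future per member without waiting for any of them.
    <T> Map<String, Future<T>> submitToOthers(Callable<T> task) {
        Map<String, Future<T>> futures = new LinkedHashMap<>();
        for (Map.Entry<String, ExecutorService> entry : members.entrySet()) {
            if (!entry.getKey().equals(localMember)) {
                futures.put(entry.getKey(), entry.getValue().submit(task));
            }
        }
        return futures;
    }

    // Blocks until every member has answered and gathers the results.
    static <T> Map<String, T> collect(Map<String, Future<T>> futures) {
        Map<String, T> results = new LinkedHashMap<>();
        for (Map.Entry<String, Future<T>> entry : futures.entrySet()) {
            try {
                results.put(entry.getKey(), entry.getValue().get());
            } catch (Exception e) {
                throw new IllegalStateException("task failed on " + entry.getKey(), e);
            }
        }
        return results;
    }

    void shutdown() {
        members.values().forEach(ExecutorService::shutdown);
    }
}
```

In this shape every member runs the same task, so nothing is "allocated" in the sense of splitting work; an asynchronous caller can ignore the futures, and a synchronous caller waits on them as collect does.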

Summary

I spent a day looking at the Openfire cluster and wrote this article, and I did learn something. From talking with some netizens, it seems that at present people prefer to use Redis for cache sharing and to build the cluster through proxies rather than use Openfire's own cluster scheme. I have not run into a situation that requires much concurrency, so I really don't know the difference. I'll try to write a Redis plug-in when I have a chance in the future.

The content of this article was collected from the web and is provided as a learning reference. The copyright belongs to the original author.