Concurrency – redundancy without central control points?
Suppose you have servers providing a service to multiple clients. If the server providing the service fails, can another server take its place without some centralized "control" that detects that the primary server is down and redirects clients to the new server?
Is it possible to do this without a centralized interface or gateway?
In other words, it's a bit like asking: can you design a load balancer that has no centralized control directing the clients?
Solution
This answer is a general outline of high availability for network applications and is not Erlang-specific. I don't know much about what is available in the OTP framework because I'm new to the language.
There are a few distinct problems here:
> Client connections must be moved to the backup machine
> The session may contain state data
> How to detect crashes
Problem 1: moving client connections. This can be solved in many different ways and at different layers of the network architecture. The simplest way is to code it into the client, so that when the connection is lost it reconnects to another machine.
If you need network transparency, you can use some technology that synchronizes TCP state between machines and then reroutes all traffic to the new machine, which can be completely invisible to the client. This is much harder to do than the first suggestion.
I'm sure there are many approaches that fall somewhere between these two.
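As a rough sketch of the simplest option above (the client reconnects on its own), the client can hold a list of known servers and try them in order. The host names, the port and the function name `connect_with_failover` are made up for illustration; only `socket.create_connection` is a real standard-library call.

```python
import socket

# Hypothetical addresses, used only for illustration.
SERVERS = [("primary.example.com", 9000), ("backup.example.com", 9000)]


def connect_with_failover(servers, connect=socket.create_connection, timeout=2.0):
    """Try each server in order and return the first connection that succeeds."""
    last_error = None
    for address in servers:
        try:
            return connect(address, timeout)
        except OSError as exc:  # refused, unreachable, timed out, ...
            last_error = exc
    raise ConnectionError(f"no server reachable: {last_error}")
```

The client would call this again whenever an established connection breaks. Note that no central component is involved: the knowledge of where the backup lives sits in the client itself.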
Problem 2: state data. You obviously need to move the session state from the crashed machine to the backup machine. This is hard to do reliably, and you may lose the last few transactions because a crashing machine may not manage to send its latest state before it goes down. You can use a synchronous sequence like this to make sure that state is never lost:
> The client sends a transaction / message to the primary.
> The primary updates some state.
> The primary sends the new state to the backup machine.
> The backup machine confirms that the new state has arrived.
> The primary confirms success to the client.
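The sequence above could look roughly like this. It is a minimal in-process sketch: plain method calls stand in for network messages, and the class and method names (`Primary`, `Backup`, `handle`, `replicate`) are assumptions chosen for the example, not part of any framework.

```python
class Backup:
    """Holds a copy of the session state."""

    def __init__(self):
        self.state = {}

    def replicate(self, key, value):
        """Store the new state and return an acknowledgement."""
        self.state[key] = value
        return True


class Primary:
    """Confirms a transaction to the client only after the backup has it."""

    def __init__(self, backup):
        self.state = {}
        self.backup = backup

    def handle(self, key, value):
        # The primary updates its own state.
        self.state[key] = value
        # The new state goes to the backup, and we wait for its confirmation.
        acked = self.backup.replicate(key, value)
        if not acked:
            raise RuntimeError("backup did not confirm; client is not acknowledged")
        # Only now is success confirmed to the client.
        return "ok"
```

Because the client never sees "ok" before the backup has confirmed, every acknowledged transaction exists on both machines.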
In some cases this is expensive (or at least less responsive), because you depend on the backup machine and the connection to it, including its latency, before you can confirm anything to the client. For better performance, you can instead let the client ask the backup machine, when it connects, which transactions it has received, and then resend the missing ones. This makes the client responsible for queuing.
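A sketch of that client-side queuing idea, under two assumptions that are mine and not stated above: every transaction carries a sequence number, and the backup receives transactions in order, so its highest sequence number tells the client where to resume. All names here (`Client`, `Backup`, `record`, `resync`) are hypothetical.

```python
class Backup:
    """Keeps the transactions it has received, keyed by sequence number."""

    def __init__(self):
        self.log = {}

    def last_seq(self):
        """Highest sequence number received so far (0 if none)."""
        return max(self.log, default=0)

    def apply(self, seq, payload):
        self.log[seq] = payload


class Client:
    """Numbers its transactions and keeps them so they can be resent."""

    def __init__(self):
        self.next_seq = 1
        self.sent = {}  # seq -> payload

    def record(self, payload):
        """Assign the next sequence number to a transaction and remember it."""
        seq = self.next_seq
        self.next_seq += 1
        self.sent[seq] = payload
        return seq

    def resync(self, backup):
        """After failover, resend every transaction the backup has not seen."""
        last = backup.last_seq()
        missing = [seq for seq in sorted(self.sent) if seq > last]
        for seq in missing:
            backup.apply(seq, self.sent[seq])
        return missing
```

If the client recorded three transactions and the primary crashed after forwarding only the first two, `resync` resends just the third. A real client would also need to discard old entries once it knows they are safe.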
Problem 3: detecting crashes. This is an interesting problem because a crash is not always well defined. What actually crashed? Consider a network problem that cuts the connection between the client and the server while both are still up and connected to the network. Or worse, the client loses its connection to the server without the server noticing. Here are some issues to consider:
> Should the client connect to the backup machine?
> If the primary updates some state and sends it to the backup machine while the backup is serving the same client, will there be a data race?
> Can the primary and the backup be up at the same time, or do you need to shut one of them down and move all sessions?
> You need some kind of authority on this, some protocol to decide which machine is the master and which is the slave. Who is that authority? How do you distribute it?
> What if your nodes lose their connection to each other, but both keep working as expected (known as a network partition)?
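To make the last two points concrete, here is a small sketch of two techniques often used for them. Neither is prescribed above; they are my assumptions about one reasonable approach. A heartbeat timeout lets a node suspect that a peer is down (it cannot tell a crash from a broken link), and a majority rule lets nodes decide without a central authority which side of a partition may keep acting as master. The names `HeartbeatMonitor` and `may_act_as_primary` are hypothetical.

```python
import time


class HeartbeatMonitor:
    """Suspects a peer when no heartbeat arrived within `timeout` seconds."""

    def __init__(self, timeout, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock
        self.last_seen = {}  # peer -> time of last heartbeat

    def heartbeat(self, peer):
        """Record that a heartbeat from `peer` arrived just now."""
        self.last_seen[peer] = self.clock()

    def suspected(self, peer):
        """True if the peer was never heard from or has been silent too long."""
        seen = self.last_seen.get(peer)
        return seen is None or self.clock() - seen > self.timeout


def may_act_as_primary(reachable_peers, cluster_size):
    """Majority rule: a node may serve as primary only if it, together with
    the peers it can still reach, forms more than half of the cluster."""
    return (len(reachable_peers) + 1) * 2 > cluster_size
```

With three nodes, a node that still reaches one peer is in the majority and may continue, while an isolated node must step down. With only two nodes neither side of a partition has a majority, which is one reason such schemes usually run an odd number of nodes.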