# Distributed Phoenix WebSocket Game

This project demonstrates a distributed Phoenix application with automatic failover using native Erlang clustering.

## Architecture

- **3 Phoenix Nodes**: Running in Docker containers, forming a distributed Erlang cluster
- **Global Process Registry**: Uses `:global` to ensure only one `GameState` GenServer runs across the cluster
- **Automatic Failover**: If a node goes down, another node automatically takes over the GameState
- **Nginx Load Balancer**: Routes WebSocket connections to healthy nodes
- **Client Failover**: Frontend automatically switches to another server if the connection is lost

## How It Works

### Distributed Erlang Clustering

- Each Phoenix container starts with a unique node name (e.g., `backend@phoenix1`)
- All nodes share the same Erlang cookie for authentication
- The `Backend.Cluster` module automatically connects nodes on startup
- Nodes use EPMD (Erlang Port Mapper Daemon) to look up each other's distribution ports
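
The source of `Backend.Cluster` is not shown here. As a minimal sketch of the steps above, assuming the peer list comes from the `CLUSTER_NODES` variable described under Configuration, the connection logic could look like this. `ClusterSketch`, `parse_nodes/1` and `connect_all/1` are hypothetical names used only for illustration.

```elixir
defmodule ClusterSketch do
  @moduledoc "Hypothetical stand-in for Backend.Cluster, which is not shown in this document."

  # Turn "backend@phoenix1,backend@phoenix2" into a list of node-name atoms.
  def parse_nodes(nil), do: []

  def parse_nodes(csv) when is_binary(csv) do
    csv
    |> String.split(",", trim: true)
    |> Enum.map(&String.trim/1)
    |> Enum.reject(&(&1 == ""))
    |> Enum.map(&String.to_atom/1)
  end

  # Try every peer except ourselves. Node.connect/1 returns true, false,
  # or :ignored (when the local node was started without --name/--sname).
  def connect_all(nodes) do
    for peer <- nodes, peer != Node.self(), into: %{} do
      {peer, Node.connect(peer)}
    end
  end
end

# Assumption: CLUSTER_NODES is read once at startup.
"CLUSTER_NODES"
|> System.get_env()
|> ClusterSketch.parse_nodes()
|> ClusterSketch.connect_all()
|> IO.inspect(label: "connect results")
```

Connecting once at startup is enough when all peers are already up. A node that boots before its peers would need a retry, which this sketch leaves out.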

### Singleton Game State

- `Backend.GameState` is registered globally using `{:global, __MODULE__}`
- Only ONE instance runs across all nodes at any time
- If the node running GameState crashes, its global name is released and a new GameState is started on another node
- All nodes can communicate with the GameState regardless of where it's running
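
To make the registration concrete, here is a minimal sketch of a GenServer registered under `{:global, __MODULE__}`. It is not the project's `Backend.GameState`: the module name `GameStateSketch`, the `put_player/2` function and the shape of the state are assumptions. Only `get_state/0` mirrors a function used elsewhere in this document.

```elixir
defmodule GameStateSketch do
  use GenServer

  # Registering under {:global, name} makes the name unique across all connected nodes.
  def start_link(initial \\ %{}) do
    GenServer.start_link(__MODULE__, initial, name: {:global, __MODULE__})
  end

  # Calls are routed through :global, so they work from any node in the cluster.
  def get_state, do: GenServer.call({:global, __MODULE__}, :get_state)

  # Hypothetical write operation, for illustration only.
  def put_player(id, pos), do: GenServer.call({:global, __MODULE__}, {:put_player, id, pos})

  @impl true
  def init(initial), do: {:ok, initial}

  @impl true
  def handle_call(:get_state, _from, state), do: {:reply, state, state}

  def handle_call({:put_player, id, pos}, _from, state) do
    {:reply, :ok, Map.put(state, id, pos)}
  end
end

{:ok, pid} = GameStateSketch.start_link()

# A second start is refused and reports the pid that already holds the name.
{:error, {:already_started, ^pid}} = GameStateSketch.start_link()

:ok = GameStateSketch.put_player("p1", {3, 4})
IO.inspect(GameStateSketch.get_state(), label: "state")
```

The same `{:error, {:already_started, pid}}` result is what a second node sees when it tries to start the process while another node already runs it.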

### Client Failover

- Frontend maintains a list of all backend servers
- Automatically reconnects to the next server if connection fails
- Uses exponential backoff and retry logic
- Displays current connection status
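
The frontend code is not part of this document. The server rotation and backoff arithmetic described above can still be sketched as pure functions, written here in Elixir to match the backend. The URLs (including the `ws://` scheme), the 500 ms base delay and the 8 s cap are all assumptions, not values taken from the client.

```elixir
defmodule ReconnectSketch do
  # Assumed endpoints: the three direct node ports listed under "Running with Docker".
  @servers ["ws://localhost:4001", "ws://localhost:4002", "ws://localhost:4003"]
  @base_ms 500
  @max_ms 8_000

  # Attempt 0 targets the first server, attempt 1 the next, wrapping around.
  def server_for(attempt), do: Enum.at(@servers, rem(attempt, length(@servers)))

  # The delay doubles with every attempt and is capped at @max_ms.
  def delay_ms(attempt), do: min(@base_ms * Integer.pow(2, attempt), @max_ms)

  # The first `attempts` {server, delay} pairs of the retry schedule.
  def plan(attempts) when attempts > 0 do
    for attempt <- 0..(attempts - 1) do
      {server_for(attempt), delay_ms(attempt)}
    end
  end
end

IO.inspect(ReconnectSketch.plan(5))
```

Capping the delay keeps a client from waiting arbitrarily long after a lengthy outage. Production clients often add random jitter as well, which is omitted here.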

## Setup

### Prerequisites

- Docker and Docker Compose
- Or: Elixir 1.15+, Erlang 26+, Node.js 18+

### Running with Docker

1. Build and start all services:

   ```bash
   docker-compose up --build
   ```

   This starts:
   - `phoenix1` on port 4001
   - `phoenix2` on port 4002
   - `phoenix3` on port 4003
   - `nginx` load balancer on port 4000

2. Start the client (in a separate terminal):

   ```bash
   cd client
   pnpm install
   pnpm dev
   ```

3. Open http://localhost:5173 in your browser

### Running Locally (Development)

Terminal 1 - Backend Node 1:

```bash
cd backend
mix deps.get
export RELEASE_NODE=backend@127.0.0.1
export RELEASE_COOKIE=mycookie
export PORT=4001
export CLUSTER_NODES="backend@127.0.0.1"
iex --name backend@127.0.0.1 --cookie mycookie -S mix phx.server
```

Terminal 2 - Backend Node 2:

```bash
cd backend
export RELEASE_NODE=backend@127.0.0.2
export RELEASE_COOKIE=mycookie
export PORT=4002
export CLUSTER_NODES="backend@127.0.0.1,backend@127.0.0.2"
iex --name backend@127.0.0.2 --cookie mycookie -S mix phx.server
```

Terminal 3 - Frontend:

```bash
cd client
pnpm install
pnpm dev
```

## Testing Failover

### Test 1: Stop a node

```bash
# Stop one container
docker-compose stop phoenix1

# The game continues running on phoenix2 or phoenix3
# Clients automatically reconnect to available nodes
```

### Test 2: Kill the node running GameState

1. Find which node is running GameState:

   ```bash
   docker-compose exec phoenix1 /app/bin/backend remote
   # In the IEx shell:
   :global.whereis_name(Backend.GameState)
   # Returns the pid, or :undefined if the name is not registered.
   # Pass the pid to node/1 to get the node name:
   :global.whereis_name(Backend.GameState) |> node()
   ```

2. Stop that specific node:

   ```bash
   docker-compose stop phoenix2 # or whichever node is running it
   ```

3. The GameState automatically starts on another node
4. All players remain in the game
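
How the takeover in step 3 is implemented is not shown in this document, and `:global` by itself only registers names; it does not restart processes. One common pattern, sketched here purely as an assumption about how such a takeover can work, is a small watcher on every node that tries to start the singleton, monitors whichever process holds the name, and tries again when that process goes down. `TakeoverSketch`, `CounterSketch` and the 50 ms retry delay are hypothetical.

```elixir
defmodule CounterSketch do
  # Stand-in for the singleton process; holds a trivial state.
  use GenServer

  @impl true
  def init(n), do: {:ok, n}
end

defmodule TakeoverSketch do
  use GenServer

  @name {:global, CounterSketch}
  @retry_ms 50

  def start_link(_opts \\ []), do: GenServer.start_link(__MODULE__, :ok)

  @impl true
  def init(:ok) do
    send(self(), :ensure)
    {:ok, nil}
  end

  @impl true
  def handle_info(:ensure, _old_ref) do
    # Unlinked start: whichever node registers the name first owns the singleton.
    pid =
      case GenServer.start(CounterSketch, 0, name: @name) do
        {:ok, pid} -> pid
        {:error, {:already_started, pid}} -> pid
      end

    # Monitor the current owner, local or remote.
    {:noreply, Process.monitor(pid)}
  end

  # The owner (or its node) went away: try to take over after a short pause.
  def handle_info({:DOWN, ref, :process, _pid, _reason}, ref) do
    Process.send_after(self(), :ensure, @retry_ms)
    {:noreply, nil}
  end

  def handle_info(_other, state), do: {:noreply, state}
end

{:ok, _watcher} = TakeoverSketch.start_link()
Process.sleep(100)
first = :global.whereis_name(CounterSketch)

# Simulate a crash of the singleton and give the watcher time to react.
Process.exit(first, :kill)
Process.sleep(300)
second = :global.whereis_name(CounterSketch)
IO.inspect({first, second}, label: "before/after")
```

The replacement process starts from its initial state in this sketch. Whether players survive a real failover therefore depends on how the actual GameState rebuilds or restores its data.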

### Test 3: Network partition

```bash
# Disconnect a node from the network
docker network disconnect websocket-testing_app_net phoenix3

# Reconnect it
docker network connect websocket-testing_app_net phoenix3
```

## Monitoring the Cluster

### Check connected nodes

```bash
docker-compose exec phoenix1 /app/bin/backend remote
```

In the IEx shell:

```elixir
# List all connected nodes
Node.list()

# Find the GameState pid (pass it to node/1 to see which node runs it)
:global.whereis_name(Backend.GameState)

# Get current game state
Backend.GameState.get_state()

# Check registered global processes
:global.registered_names()
```
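
Because `:global.whereis_name/1` returns a pid (or `:undefined`) rather than a node name, a small helper can make the lookup less error-prone. This is a sketch only; `WhereSketch.host_node/1` is a hypothetical name and not part of the project.

```elixir
defmodule WhereSketch do
  # Returns the node that hosts a globally registered name, or nil if nobody holds it.
  def host_node(name) do
    case :global.whereis_name(name) do
      :undefined -> nil
      pid when is_pid(pid) -> node(pid)
    end
  end
end

# Run outside the cluster, nothing is registered under this name, so this prints nil.
IO.inspect(WhereSketch.host_node(Backend.GameState))
```

In a remote shell attached to one of the Phoenix nodes, the same call would return something like `:"backend@phoenix2"`, depending on which node currently runs the process.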

### View logs

```bash
# All containers
docker-compose logs -f

# Specific container
docker-compose logs -f phoenix1
```

## Configuration

### Environment Variables

- `RELEASE_NODE`: Node name (e.g., `backend@phoenix1`)
- `RELEASE_COOKIE`: Erlang cookie for cluster authentication
- `CLUSTER_NODES`: Comma-separated list of nodes to connect to
- `PORT`: HTTP port for Phoenix endpoint
- `SECRET_KEY_BASE`: Phoenix secret key
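
As an illustration of how these variables fit together, the sketch below collects them into one settings map and reports any that are missing. Which variables are mandatory, the default port of 4000 and the `EnvSketch` module itself are assumptions; the project's real configuration code is not shown here.

```elixir
defmodule EnvSketch do
  # Assumption: these three must always be set; PORT and CLUSTER_NODES get defaults.
  @required ~w(RELEASE_NODE RELEASE_COOKIE SECRET_KEY_BASE)

  # Accepts a map of variables so it can be exercised without touching the real environment.
  def load(env \\ System.get_env()) do
    case Enum.reject(@required, &Map.has_key?(env, &1)) do
      [] ->
        {:ok,
         %{
           node: String.to_atom(env["RELEASE_NODE"]),
           cookie: String.to_atom(env["RELEASE_COOKIE"]),
           port: String.to_integer(Map.get(env, "PORT", "4000")),
           cluster_nodes: String.split(Map.get(env, "CLUSTER_NODES", ""), ",", trim: true)
         }}

      missing ->
        {:error, {:missing_env, missing}}
    end
  end
end

IO.inspect(
  EnvSketch.load(%{
    "RELEASE_NODE" => "backend@phoenix1",
    "RELEASE_COOKIE" => "mycookie",
    "SECRET_KEY_BASE" => "example-only",
    "PORT" => "4001",
    "CLUSTER_NODES" => "backend@phoenix1,backend@phoenix2"
  })
)
```

Failing early with the list of missing names tends to be easier to debug than a node that boots but never joins the cluster.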

### Scaling

To add more nodes, edit `docker-compose.yml`:

```yaml
  phoenix4:
    # Same config as phoenix1-3, with unique:
    # - container_name: phoenix4
    # - hostname: phoenix4
    # - RELEASE_NODE: backend@phoenix4
    # - ports: "4004:4000"
    # - ipv4_address: 172.25.0.14
```

Update `CLUSTER_NODES` in all services to include `backend@phoenix4`.

## How to Play

- Use **WASD** keys to move your player
- Your player is shown in red, others in blue
- The game state is shared across all nodes
- Try killing nodes to see failover in action!

## Troubleshooting

### Nodes not connecting

1. Check all nodes have the same `RELEASE_COOKIE`
2. Verify EPMD is running: `docker-compose exec phoenix1 epmd -names`
3. Check firewall allows ports 4369 (EPMD) and 9000-9100 (distributed Erlang)

### GameState not starting

1. Check logs: `docker-compose logs -f`
2. Verify only one instance exists globally: `:global.registered_names()`
3. Restart all nodes: `docker-compose restart`

### Frontend not connecting

1. Check nginx is running: `docker-compose ps nginx`
2. Verify at least one Phoenix node is healthy
3. Check browser console for connection errors
4. Try connecting directly to a node: http://localhost:4001

## Production Considerations

- **Change the Erlang cookie**: Use a strong secret
- **Use proper SSL/TLS**: Configure HTTPS for WebSocket connections
- **Add health checks**: Monitor node health and GameState availability
- **Persistent storage**: Add database for game state persistence
- **Rate limiting**: Protect against abuse
- **Monitoring**: Add Prometheus/Grafana for metrics
- **Logging**: Centralize logs with ELK or similar

## License

MIT