# Distributed Phoenix WebSocket Game

This project demonstrates a distributed Phoenix application with automatic failover using native Erlang clustering.

## Architecture

- **3 Phoenix Nodes**: Running in Docker containers, forming a distributed Erlang cluster
- **Global Process Registry**: Uses `:global` to ensure only one `GameState` GenServer runs across the cluster
- **Automatic Failover**: If a node goes down, another node automatically takes over the GameState
- **Nginx Load Balancer**: Routes WebSocket connections to healthy nodes
- **Client Failover**: Frontend automatically switches to another server if connection is lost

## How It Works

### Distributed Erlang Clustering

- Each Phoenix container starts with a unique node name (e.g., `backend@phoenix1`)
- All nodes share the same Erlang cookie for authentication
- The `Backend.Cluster` module automatically connects nodes on startup
- Nodes look up each other's distribution ports through EPMD (Erlang Port Mapper Daemon)
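
The source of `Backend.Cluster` is not shown in this README, so the following is only a minimal sketch of what connecting on startup could look like. The module name `ClusterSketch` and both function names are hypothetical; the sketch assumes the peer list arrives as the comma-separated `CLUSTER_NODES` value.

```elixir
defmodule ClusterSketch do
  # Hypothetical stand-in for Backend.Cluster; the real module may differ.

  # Turn "a@h1, b@h2" into [:"a@h1", :"b@h2"], ignoring blanks.
  # String.to_atom/1 is acceptable here because the value is trusted config.
  def parse_nodes(nil), do: []

  def parse_nodes(csv) when is_binary(csv) do
    csv
    |> String.split(",", trim: true)
    |> Enum.map(&String.trim/1)
    |> Enum.reject(&(&1 == ""))
    |> Enum.map(&String.to_atom/1)
  end

  # Try to connect to every peer except the local node.
  # Node.connect/1 returns true, false, or :ignored (local node not distributed).
  def connect_all(nodes) do
    nodes
    |> Enum.reject(&(&1 == Node.self()))
    |> Map.new(fn peer -> {peer, Node.connect(peer)} end)
  end
end
```

On a running node this could be called as `"CLUSTER_NODES" |> System.get_env() |> ClusterSketch.parse_nodes() |> ClusterSketch.connect_all()`, which returns a map from each peer to its connection result.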

### Singleton Game State

- `Backend.GameState` is registered globally using `{:global, __MODULE__}`
- Only ONE instance runs across all nodes at any time
- If the node running GameState goes down, `:global` drops the registration and another node starts a new instance
- All nodes can communicate with the GameState regardless of where it's running
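
To illustrate the global registration described above, here is a minimal sketch of a GenServer registered under `{:global, __MODULE__}`. `GameStateSketch` and its state shape are assumptions for illustration, not the project's `Backend.GameState`.

```elixir
defmodule GameStateSketch do
  # Illustrative only; the real Backend.GameState is not shown in this README.
  use GenServer

  @name {:global, __MODULE__}

  # The first caller in the cluster gets {:ok, pid}. Later callers get
  # {:error, {:already_started, pid}} pointing at the existing instance.
  def start_link(initial \\ %{}) do
    GenServer.start_link(__MODULE__, initial, name: @name)
  end

  # Works from any connected node, because the name is resolved via :global.
  def get_state, do: GenServer.call(@name, :get_state)

  @impl true
  def init(initial), do: {:ok, initial}

  @impl true
  def handle_call(:get_state, _from, state), do: {:reply, state, state}
end
```

Because callers address the process by its global name rather than by pid, they do not need to know which node currently hosts it.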

### Client Failover

- Frontend maintains a list of all backend servers
- Automatically reconnects to the next server if the connection fails
- Uses exponential backoff and retry logic
- Displays current connection status
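
The frontend source is not included here, so the sketch below only illustrates the arithmetic behind rotating through servers with exponential backoff. It is written in Elixir to match the backend snippets; the module `ReconnectSketch`, the 500 ms base delay, and the 10 s cap are all assumptions.

```elixir
defmodule ReconnectSketch do
  # Hypothetical helper; the real client logic lives in the frontend.

  # Delay before retry number `attempt` (0-based): base * 2^attempt, capped.
  def backoff_ms(attempt, base_ms \\ 500, max_ms \\ 10_000) when attempt >= 0 do
    min(base_ms * Integer.pow(2, attempt), max_ms)
  end

  # Round-robin over the server list: which server to try on this attempt.
  def next_server(servers, attempt) when servers != [] and attempt >= 0 do
    Enum.at(servers, rem(attempt, length(servers)))
  end
end
```

With three servers, attempts 0, 1, 2, 3 would try the first, second, third, and then the first server again, waiting longer before each retry until the cap is reached.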

## Setup

### Prerequisites

- Docker and Docker Compose
- Or: Elixir 1.15+, Erlang 26+, Node.js 18+

### Running with Docker

1. Build and start all services:

```bash
docker-compose up --build
```

This starts:
- `phoenix1` on port 4001
- `phoenix2` on port 4002
- `phoenix3` on port 4003
- `nginx` load balancer on port 4000

2. Start the client (in a separate terminal):

```bash
cd client
pnpm install
pnpm dev
```

3. Open http://localhost:5173 in your browser

### Running Locally (Development)

Terminal 1 - Backend Node 1:
```bash
cd backend
mix deps.get
export RELEASE_NODE=backend@127.0.0.1
export RELEASE_COOKIE=mycookie
export PORT=4001
export CLUSTER_NODES="backend@127.0.0.1"
iex --name backend@127.0.0.1 --cookie mycookie -S mix phx.server
```

Terminal 2 - Backend Node 2:
```bash
cd backend
export RELEASE_NODE=backend@127.0.0.2
export RELEASE_COOKIE=mycookie
export PORT=4002
export CLUSTER_NODES="backend@127.0.0.1,backend@127.0.0.2"
iex --name backend@127.0.0.2 --cookie mycookie -S mix phx.server
```

Terminal 3 - Frontend:
```bash
cd client
pnpm install
pnpm dev
```

## Testing Failover

### Test 1: Stop a node

```bash
# Stop one container
docker-compose stop phoenix1

# The game continues running on phoenix2 or phoenix3
# Clients automatically reconnect to available nodes
```

### Test 2: Kill the node running GameState

1. Find which node is running GameState:
```bash
docker-compose exec phoenix1 /app/bin/backend remote
# In the IEx shell:
:global.whereis_name(Backend.GameState)
# Returns the pid, or :undefined if nothing is registered; node(pid) gives the node name
```

2. Stop that specific node:
```bash
docker-compose stop phoenix2 # or whichever node is running it
```

3. The GameState automatically starts on another node
4. All players remain in the game
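
The takeover in step 3 depends on something on the surviving nodes noticing that the global name has become free. This README does not show how the project does that. One common pattern, sketched here under that caveat, is a small watcher process on every node that monitors the current holder and tries to claim the name when it goes down. `TakeoverSketch`, the 100 ms retry interval, and the requirement that the target's `init/1` accepts `[]` are all assumptions.

```elixir
defmodule TakeoverSketch do
  # Hypothetical watcher; not the project's actual failover code.
  use GenServer

  # `target` is any GenServer module whose init/1 accepts [].
  def start_link(target), do: GenServer.start_link(__MODULE__, target)

  @impl true
  def init(target), do: {:ok, claim_or_watch(target)}

  @impl true
  def handle_info({:DOWN, ref, :process, _pid, _reason}, %{ref: ref, target: target}) do
    # The current holder died or its node disconnected: try to take over.
    {:noreply, claim_or_watch(target)}
  end

  def handle_info(:retry, %{target: target}), do: {:noreply, claim_or_watch(target)}
  def handle_info(_other, state), do: {:noreply, state}

  # Start the target under the global name, or monitor whoever already holds it.
  defp claim_or_watch(target) do
    case GenServer.start(target, [], name: {:global, target}) do
      {:ok, pid} ->
        watch(target, pid)

      {:error, {:already_started, pid}} when is_pid(pid) ->
        watch(target, pid)

      _other ->
        # Registration was in flux or init failed; try again shortly.
        Process.send_after(self(), :retry, 100)
        %{target: target, ref: nil}
    end
  end

  defp watch(target, pid), do: %{target: target, ref: Process.monitor(pid)}
end
```

Note that a process restarted this way begins from whatever its `init/1` returns, so keeping players in the game across a failover also requires the state to be recoverable from somewhere else.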

### Test 3: Network partition

```bash
# Disconnect a node from the network
docker network disconnect websocket-testing_app_net phoenix3

# Reconnect it
docker network connect websocket-testing_app_net phoenix3
```

## Monitoring the Cluster

### Check connected nodes

```bash
docker-compose exec phoenix1 /app/bin/backend remote
```

In the IEx shell:
```elixir
# List all connected nodes
Node.list()

# Find the GameState pid (call node/1 on it to get the node name)
:global.whereis_name(Backend.GameState)

# Get current game state
Backend.GameState.get_state()

# Check registered global processes
:global.registered_names()
```
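
The individual calls above can be combined into one helper that also handles the case where nothing is registered. This is a sketch only; `ClusterInfoSketch` and the keys of the returned map are hypothetical names.

```elixir
defmodule ClusterInfoSketch do
  # Hypothetical helper for use in a remote IEx shell.

  # Summarise what this node can see. `name` is the global name to look up.
  def summary(name) do
    holder =
      case :global.whereis_name(name) do
        :undefined -> nil
        pid when is_pid(pid) -> node(pid)
      end

    %{
      self: Node.self(),
      peers: Node.list(),
      holder: holder,
      global_names: :global.registered_names()
    }
  end
end
```

In the remote shell, `ClusterInfoSketch.summary(Backend.GameState)` would then report the local node, its connected peers, and the node currently hosting GameState (or `nil` if none is registered).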

### View logs

```bash
# All containers
docker-compose logs -f

# Specific container
docker-compose logs -f phoenix1
```

## Configuration

### Environment Variables

- `RELEASE_NODE`: Node name (e.g., `backend@phoenix1`)
- `RELEASE_COOKIE`: Erlang cookie for cluster authentication
- `CLUSTER_NODES`: Comma-separated list of nodes to connect to
- `PORT`: HTTP port for Phoenix endpoint
- `SECRET_KEY_BASE`: Phoenix secret key

### Scaling

To add more nodes, edit `docker-compose.yml`:

```yaml
phoenix4:
  # Same config as phoenix1-3, with unique:
  # - container_name: phoenix4
  # - hostname: phoenix4
  # - RELEASE_NODE: backend@phoenix4
  # - ports: "4004:4000"
  # - ipv4_address: 172.25.0.14
```

Update `CLUSTER_NODES` in all services to include `backend@phoenix4`.

## How to Play

- Use **WASD** keys to move your player
- Your player is shown in red, others in blue
- The game state is shared across all nodes
- Try killing nodes to see failover in action!

## Troubleshooting

### Nodes not connecting

1. Check that all nodes have the same `RELEASE_COOKIE`
2. Verify EPMD is running: `docker-compose exec phoenix1 epmd -names`
3. Check that the firewall allows ports 4369 (EPMD) and 9000-9100 (distributed Erlang)

### GameState not starting

1. Check logs: `docker-compose logs -f`
2. Verify only one instance exists globally: `:global.registered_names()`
3. Restart all nodes: `docker-compose restart`

### Frontend not connecting

1. Check that nginx is running: `docker-compose ps nginx`
2. Verify at least one Phoenix node is healthy
3. Check the browser console for connection errors
4. Try connecting directly to a node: http://localhost:4001

## Production Considerations

- **Change the Erlang cookie**: Use a strong secret
- **Use proper SSL/TLS**: Configure HTTPS for WebSocket connections
- **Add health checks**: Monitor node health and GameState availability
- **Persistent storage**: Add database for game state persistence
- **Rate limiting**: Protect against abuse
- **Monitoring**: Add Prometheus/Grafana for metrics
- **Logging**: Centralize logs with ELK or similar

## License

MIT