working ha

2026-02-23 13:00:16 -07:00
parent 3de4cdac62
commit ffb0d9075b
11 changed files with 685 additions and 38 deletions

DISTRIBUTED_SETUP.md Normal file

@@ -0,0 +1,236 @@
# Distributed Phoenix WebSocket Game
This project demonstrates a distributed Phoenix application with automatic failover using native Erlang clustering.
## Architecture
- **3 Phoenix Nodes**: Running in Docker containers, forming a distributed Erlang cluster
- **Global Process Registry**: Uses `:global` to ensure only one `GameState` GenServer runs across the cluster
- **Automatic Failover**: If a node goes down, another node automatically takes over the GameState
- **Nginx Load Balancer**: Routes WebSocket connections to healthy nodes
- **Client Failover**: Frontend automatically switches to another server if connection is lost
## How It Works
### Distributed Erlang Clustering
- Each Phoenix container starts with a unique node name (e.g., `backend@phoenix1`)
- All nodes share the same Erlang cookie for authentication
- `Backend.Cluster` module automatically connects nodes on startup
- Nodes use EPMD (Erlang Port Mapper Daemon) to look up each other's distribution ports; the list of peers to connect to comes from `CLUSTER_NODES`
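The connection step can be sketched roughly as follows. This is a minimal illustration, not the project's `Backend.Cluster` code: the module name `ClusterSketch` and both of its functions are hypothetical, and it assumes the peer list is read from `CLUSTER_NODES`.

```elixir
defmodule ClusterSketch do
  @moduledoc "Hypothetical helper: parses CLUSTER_NODES and connects to each peer."

  # Turn "backend@phoenix1, backend@phoenix2" into a list of node-name atoms.
  def parse_nodes(nil), do: []

  def parse_nodes(csv) when is_binary(csv) do
    csv
    |> String.split(",", trim: true)
    |> Enum.map(&String.trim/1)
    |> Enum.reject(&(&1 == ""))
    |> Enum.map(&String.to_atom/1)
  end

  # Try to connect to every peer except the local node.
  # Node.connect/1 returns true, false, or :ignored (local node not distributed).
  def connect_all(nodes) do
    nodes
    |> Enum.reject(&(&1 == Node.self()))
    |> Enum.map(fn node -> {node, Node.connect(node)} end)
  end
end

# Usage: read the peer list from the environment and connect.
"CLUSTER_NODES"
|> System.get_env()
|> ClusterSketch.parse_nodes()
|> ClusterSketch.connect_all()
```

Because Erlang distribution is transitive by default, connecting to one reachable peer is usually enough to join the whole mesh.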
### Singleton Game State
- `Backend.GameState` is registered globally using `{:global, __MODULE__}`
- Only ONE instance runs across all nodes at any time
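- If the node running GameState goes down, another node starts a new instance and registers it under the same name (`:global` only manages the name, so the restart has to come from application code)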
- All nodes can communicate with the GameState regardless of where it's running
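Global registration can be sketched as below. This is an illustration under assumptions, not the actual `Backend.GameState`: the module name `GlobalSingletonSketch` and its state shape are hypothetical.

```elixir
defmodule GlobalSingletonSketch do
  @moduledoc "Hypothetical GenServer registered cluster-wide via :global."
  use GenServer

  def start_link(initial_state) do
    case GenServer.start_link(__MODULE__, initial_state, name: {:global, __MODULE__}) do
      {:ok, pid} ->
        {:ok, pid}

      # Another process already owns the name, so do not start a second copy.
      {:error, {:already_started, _pid}} ->
        :ignore
    end
  end

  # The call is routed to whichever node hosts the registered process.
  def get_state, do: GenServer.call({:global, __MODULE__}, :get_state)

  @impl true
  def init(state), do: {:ok, state}

  @impl true
  def handle_call(:get_state, _from, state), do: {:reply, state, state}
end
```

Returning `:ignore` keeps a supervisor from treating the duplicate as a failure, but it also means that node will not retry on its own. A real takeover additionally needs each node to notice when the registered process dies (for example by monitoring its pid) and attempt registration again; that part is omitted here.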
### Client Failover
- Frontend maintains a list of all backend servers
- Automatically reconnects to the next server if connection fails
- Uses exponential backoff and retry logic
- Displays current connection status
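The retry logic described above boils down to two small functions. The sketch below writes them in Elixir only to match the backend examples; the frontend code itself is not shown in this document, and the module name, base delay, and cap are all assumed values.

```elixir
defmodule FailoverSketch do
  @moduledoc "Hypothetical, language-neutral failover logic expressed in Elixir."

  @base_ms 500     # assumed initial delay
  @max_ms 10_000   # assumed upper bound on the delay

  # Delay before retry number `attempt` (0-based): base * 2^attempt, capped.
  def backoff_ms(attempt) when is_integer(attempt) and attempt >= 0 do
    min(@base_ms * Integer.pow(2, attempt), @max_ms)
  end

  # Rotate to the next server in the list, wrapping around at the end.
  def next_server(servers, current_index) when servers != [] do
    index = rem(current_index + 1, length(servers))
    {index, Enum.at(servers, index)}
  end
end
```

With three servers on ports 4001 to 4003, a client that loses its connection to the third server would wrap around to the first and wait progressively longer between attempts.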
## Setup
### Prerequisites
- Docker and Docker Compose
- Or, to run the backend without Docker: Elixir 1.15+, Erlang/OTP 26+, Node.js 18+
- pnpm for the client (used in both setups)
### Running with Docker
1. Build and start all services:
```bash
docker-compose up --build
```
This starts:
- `phoenix1` on port 4001
- `phoenix2` on port 4002
- `phoenix3` on port 4003
- `nginx` load balancer on port 4000
2. Start the client dev server (in a separate terminal):
```bash
cd client
pnpm install
pnpm dev
```
3. Open http://localhost:5173 in your browser
### Running Locally (Development)
Terminal 1 - Backend Node 1:
```bash
cd backend
mix deps.get
export RELEASE_NODE=backend@127.0.0.1
export RELEASE_COOKIE=mycookie
export PORT=4001
export CLUSTER_NODES="backend@127.0.0.1"
iex --name backend@127.0.0.1 --cookie mycookie -S mix phx.server
```
Terminal 2 - Backend Node 2:
```bash
cd backend
export RELEASE_NODE=backend@127.0.0.2
export RELEASE_COOKIE=mycookie
export PORT=4002
export CLUSTER_NODES="backend@127.0.0.1,backend@127.0.0.2"
iex --name backend@127.0.0.2 --cookie mycookie -S mix phx.server
```
Terminal 3 - Frontend:
```bash
cd client
pnpm install
pnpm dev
```
## Testing Failover
### Test 1: Stop a node
```bash
# Stop one container
docker-compose stop phoenix1
# The game continues running on phoenix2 or phoenix3
# Clients automatically reconnect to available nodes
```
### Test 2: Kill the node running GameState
1. Find which node is running GameState:
```bash
docker-compose exec phoenix1 /app/bin/backend remote
# In the IEx shell:
pid = :global.whereis_name(Backend.GameState)
# Returns a pid, or :undefined if the name is not registered
node(pid)
# Returns the name of the node hosting that pid
```
2. Stop that specific node:
```bash
docker-compose stop phoenix2 # or whichever node is running it
```
3. The GameState automatically starts on another node
4. All players remain in the game
### Test 3: Network partition
```bash
# Disconnect a node from the network
docker network disconnect websocket-testing_app_net phoenix3
# Reconnect it
docker network connect websocket-testing_app_net phoenix3
```
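To watch the disconnect and reconnect from inside a running node, a process can subscribe to node events with `:net_kernel.monitor_nodes/1`. The watcher below is a hypothetical sketch (the module name `NodeWatcherSketch` is not part of the project); it prints each event and keeps a history.

```elixir
defmodule NodeWatcherSketch do
  @moduledoc "Hypothetical process that records :nodeup / :nodedown events."
  use GenServer

  def start_link(_opts \\ []), do: GenServer.start_link(__MODULE__, [], name: __MODULE__)

  # Returns the recorded events, oldest first.
  def events, do: GenServer.call(__MODULE__, :events)

  @impl true
  def init(events) do
    # Subscribe this process to {:nodeup, node} and {:nodedown, node} messages.
    :ok = :net_kernel.monitor_nodes(true)
    {:ok, events}
  end

  @impl true
  def handle_info({event, node}, events) when event in [:nodeup, :nodedown] do
    IO.puts("#{event}: #{node}")
    {:noreply, [{event, node} | events]}
  end

  # Ignore any unrelated messages.
  def handle_info(_other, events), do: {:noreply, events}

  @impl true
  def handle_call(:events, _from, events), do: {:reply, Enum.reverse(events), events}
end
```

Started in the remote IEx shell of `phoenix1`, it should report a `nodedown` for the disconnected node once the connection is detected as lost, and a `nodeup` after the nodes reconnect.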
## Monitoring the Cluster
### Check connected nodes
```bash
docker-compose exec phoenix1 /app/bin/backend remote
```
In the IEx shell:
```elixir
# List all connected nodes
Node.list()
# Check which node is running GameState (whereis_name returns a pid, or :undefined if unregistered)
:global.whereis_name(Backend.GameState) |> node()
# Get current game state
Backend.GameState.get_state()
# Check registered global processes
:global.registered_names()
```
### View logs
```bash
# All containers
docker-compose logs -f
# Specific container
docker-compose logs -f phoenix1
```
## Configuration
### Environment Variables
- `RELEASE_NODE`: Node name (e.g., `backend@phoenix1`)
- `RELEASE_COOKIE`: Erlang cookie for cluster authentication
- `CLUSTER_NODES`: Comma-separated list of nodes to connect to
- `PORT`: HTTP port for Phoenix endpoint
- `SECRET_KEY_BASE`: Phoenix secret key
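One way these variables might be read at startup is sketched below. The module `RuntimeEnvSketch` is hypothetical, and the defaults (port 4000, empty peer list) are assumptions rather than the project's actual configuration.

```elixir
defmodule RuntimeEnvSketch do
  @moduledoc "Hypothetical loader for the environment variables listed above."

  def load do
    %{
      # Required: System.fetch_env!/1 raises if the variable is missing.
      node: System.fetch_env!("RELEASE_NODE"),
      cookie: System.fetch_env!("RELEASE_COOKIE"),
      secret_key_base: System.fetch_env!("SECRET_KEY_BASE"),
      # Optional, with assumed defaults.
      port: "PORT" |> System.get_env("4000") |> String.to_integer(),
      cluster_nodes: "CLUSTER_NODES" |> System.get_env("") |> String.split(",", trim: true)
    }
  end
end
```

Failing fast on a missing cookie or secret tends to surface configuration mistakes at boot instead of as a node that silently never joins the cluster.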
### Scaling
To add more nodes, edit `docker-compose.yml`:
```yaml
phoenix4:
# Same config as phoenix1-3, with unique:
# - container_name: phoenix4
# - hostname: phoenix4
# - RELEASE_NODE: backend@phoenix4
# - ports: "4004:4000"
# - ipv4_address: 172.25.0.14
```
Update `CLUSTER_NODES` in all services to include `backend@phoenix4`.
## How to Play
- Use **WASD** keys to move your player
- Your player is shown in red, others in blue
- The game state is shared across all nodes
- Try killing nodes to see failover in action!
## Troubleshooting
### Nodes not connecting
1. Check that all nodes have the same `RELEASE_COOKIE`
2. Verify EPMD is running: `docker-compose exec phoenix1 epmd -names`
3. Check that the firewall allows ports 4369 (EPMD) and 9000-9100 (distributed Erlang)
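The same checks can be run from inside a node. The helper below is a hypothetical sketch (the module name `ConnectivitySketch` is not part of the project) built on `Node.self/0`, `Node.get_cookie/0`, and `Node.ping/1`.

```elixir
defmodule ConnectivitySketch do
  @moduledoc "Hypothetical diagnostic: reports local node info and pings each peer."

  def check(peers) do
    %{
      self: Node.self(),
      # :nocookie means the local node is not running in distributed mode.
      cookie_set?: Node.get_cookie() != :nocookie,
      # Node.ping/1 returns :pong when the peer is reachable, :pang otherwise.
      peers: Map.new(peers, fn peer -> {peer, Node.ping(peer)} end)
    }
  end
end
```

For example, `ConnectivitySketch.check([:"backend@phoenix2", :"backend@phoenix3"])` in the remote IEx shell of `phoenix1` shows at a glance which peers answer. A `:pang` from a node that is running usually points to a cookie mismatch or a blocked port.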
### GameState not starting
1. Check logs: `docker-compose logs -f`
2. Verify only one instance exists globally: `:global.registered_names()`
3. Restart all nodes: `docker-compose restart`
### Frontend not connecting
1. Check nginx is running: `docker-compose ps nginx`
2. Verify at least one Phoenix node is healthy
3. Check browser console for connection errors
4. Try connecting directly to a node: http://localhost:4001
## Production Considerations
- **Change the Erlang cookie**: Use a strong secret
- **Use proper SSL/TLS**: Configure HTTPS for WebSocket connections
- **Add health checks**: Monitor node health and GameState availability
- **Persistent storage**: Add database for game state persistence
- **Rate limiting**: Protect against abuse
- **Monitoring**: Add Prometheus/Grafana for metrics
- **Logging**: Centralize logs with ELK or similar
## License
MIT