# Portfolio & AlpenQueue Playground

> Transformed from postcard to playground – an interactive portfolio that doubles as a live demo for AlpenQueue, featuring real-time web scraping with Server-Sent Events.

## Live Demo

**Try it now:** [https://maxtheweb.com](https://maxtheweb.com)

Submit a scraping job and watch results stream in real time. The form comes pre-filled with Hacker News for instant gratification:
- **URL:** `https://news.ycombinator.com`
- **Selector:** `.athing .titleline`

## Features

- **Interactive Web Scraping Demo** – Submit jobs to AlpenQueue, see results instantly
- **Real-Time Updates** – Server-Sent Events stream results as they complete
- **Live Statistics** – Track jobs queued and results received in your session
- **Terminal Aesthetic** – Dark theme with mountain-green (`#4a9e5f`) accents
- **Auto-Reconnection** – SSE connection automatically recovers from network issues
- **CSS Selector Support** – Extract any content using standard CSS selectors

## Architecture

```
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Browser   │────▶│  Portfolio  │────▶│  AlpenQueue │
│             │     │   (Nginx)   │     │     API     │
└─────────────┘     └─────────────┘     └─────────────┘
       ▲                                       │
       │                                       │
       │            ┌─────────────┐            │
       └────SSE─────│ Webhook-SSE │◀──Webhook──┘
                    │   (:8081)   │
                    └─────────────┘
```

## Technology Stack

- **Frontend**: HTML5, CSS3, Vanilla JavaScript (no frameworks!)
- **Backend**: Go 1.25 (webhook-SSE service)
- **Integration**: [AlpenQueue](https://git.maxtheweb.com/maxtheweb/AlpenQueue) API
- **Infrastructure**: Nginx, Systemd, Let's Encrypt SSL
- **Hosting**: Self-hosted on €5/month Hetzner VPS

## Project Structure

```
portfolio/
├── index.html          # Main interactive playground
├── styles.css          # Terminal-style dark theme
├── script.js           # SSE client & form handler
├── webhook-sse/        # Go SSE broker service
│   ├── main.go         # Webhook receiver & SSE broadcaster
│   └── go.mod          # Go module (1.25)
└── README.md           # You are here!
```

## Installation & Deployment

### Local Development

1. **Clone the repository**
   ```bash
   git clone git@git.maxtheweb.com:maxtheweb/Portfolio.git
   cd Portfolio
   ```

2. **Run the webhook-SSE service**
   ```bash
   cd webhook-sse
   go run main.go
   # Server listening on :8081
   ```

3. **Serve the frontend**
   ```bash
   # From project root
   python3 -m http.server 8080
   # Visit http://localhost:8080
   ```

4. **Configure AlpenQueue endpoint** (optional)
   - Update `script.js` to point to your AlpenQueue instance
   - Or use the public demo at `https://alpenqueue.maxtheweb.com`

### Production Deployment

1. **Nginx Configuration** (`/etc/nginx/conf.d/portfolio.conf`)
   ```nginx
   server {
       server_name maxtheweb.com www.maxtheweb.com;
       root /var/www/portfolio;

       location / {
           try_files $uri $uri/ =404;
       }

       location /events {
           proxy_pass http://localhost:8081/events;
           proxy_set_header Connection "";
           proxy_http_version 1.1;
           proxy_buffering off;
           proxy_cache off;
       }

       location /webhook {
           proxy_pass http://localhost:8081/webhook;
       }
   }
   ```

2. **Systemd Service** (`/etc/systemd/system/webhook-sse.service`)
   ```ini
   [Unit]
   Description=Webhook SSE Service for Portfolio
   After=network.target

   [Service]
   Type=simple
   User=webhost
   Group=webhost
   WorkingDirectory=/var/www/portfolio/webhook-sse
   ExecStart=/var/www/portfolio/webhook-sse/webhook-sse
   Restart=always
   RestartSec=10

   # Security hardening
   NoNewPrivileges=true
   PrivateTmp=true
   ProtectHome=true
   ProtectSystem=strict
   ReadWritePaths=/var/www/portfolio/webhook-sse

   [Install]
   WantedBy=multi-user.target
   ```

3. **Enable and start the service**
   ```bash
   sudo systemctl daemon-reload
   sudo systemctl enable webhook-sse
   sudo systemctl start webhook-sse
   ```

4. **SSL with Let's Encrypt**
   ```bash
   sudo certbot --nginx -d maxtheweb.com -d www.maxtheweb.com
   ```

## How It Works

1. **User submits a scraping job** via the web form
2. **Portfolio sends POST request** to AlpenQueue API
3. **AlpenQueue queues the job** and returns job ID
4. **AlpenQueue processes the job** (fetches URL, extracts content)
5. **AlpenQueue sends webhook** with results to webhook-SSE service
6. **Webhook-SSE broadcasts** result to all connected SSE clients
7. **Browser receives event** and displays result with animation
8. **Statistics update** to reflect new job and result
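
The bookkeeping in step 8 can be sketched as a small counter object. This is an illustrative sketch only, not the code in `script.js`: the names `createSessionStats`, `jobQueued`, `resultReceived`, and `snapshot` are hypothetical.

```javascript
// Hypothetical per-session counters; names are illustrative, not taken from script.js.
function createSessionStats() {
  const stats = { queued: 0, received: 0 };
  return {
    // Call once the AlpenQueue API has accepted a job (step 3).
    jobQueued() { stats.queued += 1; },
    // Call when a job-result event arrives over SSE (step 7).
    resultReceived() { stats.received += 1; },
    // Return a copy so callers cannot mutate the counters directly.
    snapshot() { return { ...stats }; },
  };
}
```

Keeping the counters behind a closure means the display code only ever reads a snapshot, which makes it harder to get the two numbers out of sync.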

## API Integration

### AlpenQueue Jobs API

**Request:**
```http
POST https://alpenqueue.maxtheweb.com/jobs
Content-Type: application/json

{
  "url": "https://example.com",
  "selector": ".content",
  "webhook_url": "https://maxtheweb.com/webhook"
}
```

**Response:**
```
Job 7 created
```
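
From the browser, the request and response above could be handled roughly as follows. This is a sketch under assumptions: the helper names `buildJobRequest`, `parseJobId`, and `submitJob` are hypothetical, and the response is assumed to always be plain text of the form `Job <number> created`, as in the single example shown.

```javascript
// Build fetch() options for a job submission. The field names (url, selector,
// webhook_url) come from the request example above.
function buildJobRequest(url, selector, webhookUrl) {
  return {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ url, selector, webhook_url: webhookUrl }),
  };
}

// Extract the numeric ID from a plain-text response such as "Job 7 created".
// Returns null when the text does not match that (assumed) shape.
function parseJobId(responseText) {
  const match = /^Job (\d+) created/.exec(responseText.trim());
  return match ? Number(match[1]) : null;
}

// Hypothetical helper tying the two together; needs a runtime that provides fetch().
async function submitJob(apiBase, url, selector, webhookUrl) {
  const response = await fetch(`${apiBase}/jobs`, buildJobRequest(url, selector, webhookUrl));
  if (!response.ok) throw new Error(`Job submission failed: HTTP ${response.status}`);
  return parseJobId(await response.text());
}
```

A call might look like `submitJob('https://alpenqueue.maxtheweb.com', 'https://example.com', '.content', 'https://maxtheweb.com/webhook')`.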

### SSE Event Format

**Connection:**
```javascript
const eventSource = new EventSource('https://maxtheweb.com/events');
```

**Event Data:**
```javascript
event: job-result
data: {
  "status": "ok",        // or "error", "blocked"
  "took": "0.3s",        // execution time
  "url": "https://example.com",
  "content": "Extracted content here..."
}
```

**Keep-Alive:**
```javascript
event: ping
data: keep-alive
```
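
In the browser, `EventSource` parses these frames for you. To show how the `event:` and `data:` fields fit together, here is a minimal hand-written parser. It is a sketch with hypothetical names (`parseSseFrame`, `decodeFrame`), and it assumes the JSON payload arrives on a single `data:` line rather than in the expanded form shown above.

```javascript
// Parse one SSE frame (the text between blank lines) into { event, data }.
// Only the "event" and "data" fields are handled; "id", "retry", and comment
// lines (starting with ":") are ignored.
function parseSseFrame(frame) {
  let event = 'message'; // default event type when no "event:" field is present
  const dataLines = [];
  for (const line of frame.split(/\r?\n/)) {
    if (line === '' || line.startsWith(':')) continue;
    const colon = line.indexOf(':');
    const field = colon === -1 ? line : line.slice(0, colon);
    let value = colon === -1 ? '' : line.slice(colon + 1);
    if (value.startsWith(' ')) value = value.slice(1); // strip one leading space
    if (field === 'event') event = value;
    else if (field === 'data') dataLines.push(value);
  }
  return { event, data: dataLines.join('\n') };
}

// Decode a job-result payload as JSON; other frames (such as ping) keep their text.
function decodeFrame(frame) {
  const { event, data } = parseSseFrame(frame);
  return event === 'job-result'
    ? { event, result: JSON.parse(data) }
    : { event, result: data };
}
```

With `EventSource`, the equivalent is `eventSource.addEventListener('job-result', (e) => JSON.parse(e.data))`.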

## Configuration

### Webhook-SSE Service

The service runs on port 8081 by default. No configuration file needed – it's designed to be simple and self-contained.

### CORS Headers

- **SSE Endpoint** (`/events`): Allows `https://maxtheweb.com`
- **Webhook Endpoint** (`/webhook`): Allows `https://alpenqueue.maxtheweb.com`

### Frontend Configuration

Edit `script.js` to modify:
- AlpenQueue API endpoint (line ~50)
- SSE endpoint (line ~150)
- Result display limit (default: 10)
- Reconnection delay (default: 5000ms)
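
One way to picture these settings is as a single object plus a reconnect helper that uses the delay. This is a sketch, not the layout of `script.js`: the `CONFIG` property names and the `connectWithRetry` function are hypothetical, and only the values come from this README.

```javascript
// Hypothetical settings object; the values mirror the defaults listed above,
// but the property names are illustrative and not taken from script.js.
const CONFIG = {
  apiEndpoint: 'https://alpenqueue.maxtheweb.com/jobs',
  sseEndpoint: 'https://maxtheweb.com/events',
  resultLimit: 10,
  reconnectDelayMs: 5000,
};

// Open an SSE connection and reopen it after a fixed delay when it fails.
// createSource and schedule are injected so the logic can run outside a browser.
function connectWithRetry(createSource, onResult, delayMs, schedule = setTimeout) {
  const source = createSource();
  source.addEventListener('job-result', (e) => onResult(JSON.parse(e.data)));
  source.onerror = () => {
    source.close(); // stop the built-in retry so only one timer is pending
    schedule(() => connectWithRetry(createSource, onResult, delayMs, schedule), delayMs);
  };
  return source;
}
```

In a browser this could be started with `connectWithRetry(() => new EventSource(CONFIG.sseEndpoint), showResult, CONFIG.reconnectDelayMs)`, where `showResult` stands in for whatever function renders a result.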

## Development

### Running Tests

```bash
# Test SSE connection
curl -N https://maxtheweb.com/events

# Test webhook endpoint
curl -X POST https://maxtheweb.com/webhook \
  -H "Content-Type: application/json" \
  -d '{"status":"ok","took":"0.5s","url":"test","content":"Hello!"}'
```

### Building Webhook-SSE

```bash
cd webhook-sse
go build -o webhook-sse main.go
```

### Code Style

- **JavaScript**: Vanilla JS, no frameworks, clear function names
- **Go**: Standard library only, channel-based concurrency
- **CSS**: BEM-like naming, CSS variables for theming

## Security Considerations

- **CORS**: Each endpoint allows a single origin (see CORS Headers above)
- **Service User**: Runs as non-root `webhost` user
- **Systemd Hardening**: PrivateTmp, ProtectSystem, NoNewPrivileges
- **Input Validation**: URL and selector validated client-side
- **Rate Limiting**: Consider adding nginx rate limits for production
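
The client-side validation mentioned above might look something like the sketch below. The function name `validateJobInput` and the specific rules (http/https only, non-empty selector) are assumptions for illustration; the actual checks in `script.js` may differ. Client-side checks improve the user experience but can be bypassed, so they do not replace validation on the server.

```javascript
// Validate form input before submitting a job. Returns a list of problems;
// an empty list means the input passed these (illustrative) checks.
function validateJobInput(url, selector) {
  const errors = [];
  try {
    const parsed = new URL(url); // throws a TypeError on malformed input
    if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
      errors.push('URL must use http or https');
    }
  } catch {
    errors.push('URL is not valid');
  }
  if (typeof selector !== 'string' || selector.trim() === '') {
    errors.push('Selector must not be empty');
  }
  return errors;
}
```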

## Performance

- **Buffered Channels**: SSE broker uses a 100-message buffer
- **Result Limiting**: Frontend displays the last 10 results
- **Non-Blocking Broadcast**: Slow clients don't affect others
- **Efficient DOM Updates**: Results added with minimal reflow
- **Auto-Reconnection**: Fixed 5-second retry delay keeps disconnected clients from reconnecting in a tight loop
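
Result limiting on the frontend can be as simple as trimming a list on every insert. The sketch below assumes newest-first ordering and uses a hypothetical `addResult` helper; it is not taken from `script.js`.

```javascript
// Prepend the newest result and drop anything beyond the limit.
// Returns a new array so the caller's list is not mutated.
function addResult(results, result, limit = 10) {
  return [result, ...results].slice(0, limit);
}
```

Capping the list keeps the number of rendered result nodes bounded, no matter how long the session runs.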

## Contributing

Issues and pull requests are welcome at [git.maxtheweb.com/maxtheweb/Portfolio](https://git.maxtheweb.com/maxtheweb/Portfolio).

### Development Workflow

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Test locally with both services running
5. Submit a pull request

## License

MIT License – see the [LICENSE](LICENSE) file for details.

## Credits

- **Built by**: Max @ [maxtheweb.com](https://maxtheweb.com) | [max@maxtheweb.com](mailto:max@maxtheweb.com)
- **Powered by**: [AlpenQueue](https://git.maxtheweb.com/maxtheweb/AlpenQueue) – A lightweight task queue in Go
- **Hosted on**: €5/month Hetzner VPS (self-hosted with pride!)
- **Inspired by**: The transformation from "postcard to playground"

---

*"What started as a simple portfolio became an interactive demonstration of real-time web technologies. Sometimes the best portfolios don't just tell – they show."*

## Links

- **Live Demo**: [https://maxtheweb.com](https://maxtheweb.com)
- **AlpenQueue**: [https://git.maxtheweb.com/maxtheweb/AlpenQueue](https://git.maxtheweb.com/maxtheweb/AlpenQueue)
- **AlpenQueue API**: [https://alpenqueue.maxtheweb.com](https://alpenqueue.maxtheweb.com)