Explaining MLflow SQLite Queries | AI-generated and translated
Question: Explain these SQL queries used to inspect an MLflow SQLite database.
Answer:
These queries read the MLflow tracking database (mlflow.db) directly with SQLite.
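As a minimal sketch of what reading the file directly can look like from Python, the helper below opens a SQLite file in read-only mode and lists its tables. The function name `list_tables` is hypothetical, and the only assumption about the file is that it is an ordinary SQLite database such as the mlflow.db mentioned above.

```python
import sqlite3


def list_tables(db_path: str) -> list[str]:
    """Return the names of all tables in a SQLite file, opened read-only."""
    # mode=ro makes SQLite reject writes, so inspecting the tracking data
    # cannot modify it. Paths containing '?' or '#' would need URI escaping.
    conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
        return [name for (name,) in rows]
    finally:
        conn.close()
```

Calling `list_tables("mlflow.db")` on a tracking database should include the tables discussed in this answer, such as experiments, runs, params, metrics, latest_metrics, and tags.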
1. Computing run duration
SELECT
(r.end_time - r.start_time)/1000/60 as duration_min
FROM runs r
JOIN experiments e ON r.experiment_id = e.experiment_id
ORDER BY r.start_time;
What it does
MLflow stores timestamps as milliseconds since the Unix epoch.
Example:
start_time = 1749500000000
end_time = 1749503600000
Difference:
3600000 milliseconds
Converting to minutes:
(end_time - start_time)
/1000 -- milliseconds → seconds
/60 -- seconds → minutes
Result:
duration_min = 60
Meaning
Shows how long each training run took.
Useful for:
- Comparing training times
- Detecting failed runs
- Performance benchmarking
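The arithmetic above can be checked against an in-memory database. This is a sketch under stated assumptions: the two tables are stand-ins that contain only the columns the duration query touches (the real MLflow tables have more), the run IDs and the experiment name are made up, and a run with no end_time is modeled as NULL. It also adds a hypothetical `duration_min_exact` column to show that `/1000/60` on integers truncates in SQLite.

```python
import sqlite3

# Stand-in schema: only the columns used by the duration query (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE experiments (experiment_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE runs (run_uuid TEXT PRIMARY KEY, experiment_id INTEGER,
                   start_time INTEGER, end_time INTEGER);
INSERT INTO experiments VALUES (1, 'demo');
INSERT INTO runs VALUES ('abc123', 1, 1749500000000, 1749503600000);  -- 60 min
INSERT INTO runs VALUES ('def456', 1, 1749510000000, 1749510090000);  -- 90 s
INSERT INTO runs VALUES ('ghi789', 1, 1749520000000, NULL);           -- no end_time
""")

DURATION_SQL = """
SELECT r.run_uuid,
       (r.end_time - r.start_time) / 1000 / 60 AS duration_min,
       (r.end_time - r.start_time) / 60000.0   AS duration_min_exact
FROM runs r
JOIN experiments e ON r.experiment_id = e.experiment_id
ORDER BY r.start_time
"""

durations = conn.execute(DURATION_SQL).fetchall()
for row in durations:
    print(row)
# ('abc123', 60, 60.0)
# ('def456', 1, 1.5)      integer division drops the half minute
# ('ghi789', None, None)  NULL end_time yields a NULL duration
```

Dividing by a floating-point constant is one way to keep fractional minutes when short runs matter.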
2. Showing all parameters
SELECT run_uuid, key, value
FROM params
ORDER BY run_uuid, key;
The MLflow params table
Stores hyperparameters.
Example:
| run_uuid | key | value |
|---|---|---|
| abc123 | depth | 24 |
| abc123 | lr | 0.001 |
| abc123 | batch_size | 32 |
In other words, calling:
mlflow.log_param("depth", 24)
mlflow.log_param("lr", 0.001)
creates the corresponding rows in this table.
Typical NanoChat parameters
You might see:
depth=24
window_pattern=L
device_batch_size=32
max_seq_len=2048
num_iterations=5
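Because the params table holds one row per (run, key) pair, it is often handier to regroup the rows into one dictionary per run. The sketch below does that on an in-memory stand-in table; the helper name `params_by_run` and the second run `def456` are hypothetical. MLflow stores parameter values as strings, so numeric values need an explicit cast after reading.

```python
import sqlite3
from collections import defaultdict

# Stand-in params table with the three columns used by the query above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE params (key TEXT, value TEXT, run_uuid TEXT);
INSERT INTO params VALUES ('depth', '24', 'abc123');
INSERT INTO params VALUES ('lr', '0.001', 'abc123');
INSERT INTO params VALUES ('batch_size', '32', 'abc123');
INSERT INTO params VALUES ('depth', '12', 'def456');
""")


def params_by_run(conn: sqlite3.Connection) -> dict[str, dict[str, str]]:
    """Group params rows into {run_uuid: {key: value}}."""
    grouped: dict[str, dict[str, str]] = defaultdict(dict)
    query = "SELECT run_uuid, key, value FROM params ORDER BY run_uuid, key"
    for run_uuid, key, value in conn.execute(query):
        grouped[run_uuid][key] = value  # values stay strings, as stored
    return dict(grouped)


run_params = params_by_run(conn)
print(run_params["abc123"])
# Cast explicitly when a number is needed.
learning_rate = float(run_params["abc123"]["lr"])
```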
3. Showing metric history
SELECT
m.run_uuid,
substr(r.run_uuid,1,8) as run_short,
t.value as run_name,
m.key,
m.value,
m.step
FROM metrics m
JOIN runs r ON m.run_uuid = r.run_uuid
LEFT JOIN tags t
ON t.run_uuid = r.run_uuid
AND t.key = 'mlflow.runName'
ORDER BY r.run_uuid, m.key, m.step
LIMIT 60;
Tables involved
metrics
Stores every logged metric point.
Example:
| run_uuid | key | value | step |
|---|---|---|---|
| abc123 | loss | 3.2 | 0 |
| abc123 | loss | 2.8 | 1 |
| abc123 | loss | 2.4 | 2 |
runs
Contains run metadata.
tags
Stores additional information.
Example:
mlflow.runName = "depth24_batch32"
Query output
Example output:
| run_name | key | value | step |
|---|---|---|---|
| depth24 | loss | 3.2 | 0 |
| depth24 | loss | 2.8 | 1 |
| depth24 | loss | 2.4 | 2 |
This shows the complete metric history.
Useful for plotting training curves.
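To turn the history rows into something a plotting library can consume, they can be grouped into one series per (run, metric). The following is a sketch on in-memory stand-in tables that contain only the columns the query uses; the helper name `load_curves` and the untagged run `def456` are hypothetical. Runs without a mlflow.runName tag come back with a NULL run_name from the LEFT JOIN, so the sketch falls back to the shortened run ID.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE runs (run_uuid TEXT PRIMARY KEY);
CREATE TABLE metrics (run_uuid TEXT, key TEXT, value REAL, step INTEGER);
CREATE TABLE tags (run_uuid TEXT, key TEXT, value TEXT);
INSERT INTO runs VALUES ('abc123'), ('def456');
INSERT INTO tags VALUES ('abc123', 'mlflow.runName', 'depth24_batch32');
INSERT INTO metrics VALUES
  ('abc123', 'loss', 3.2, 0), ('abc123', 'loss', 2.8, 1),
  ('abc123', 'loss', 2.4, 2), ('def456', 'loss', 3.5, 0);
""")

HISTORY_SQL = """
SELECT substr(r.run_uuid, 1, 8) AS run_short,
       t.value AS run_name, m.key, m.value, m.step
FROM metrics m
JOIN runs r ON m.run_uuid = r.run_uuid
LEFT JOIN tags t
  ON t.run_uuid = r.run_uuid
 AND t.key = 'mlflow.runName'
ORDER BY r.run_uuid, m.key, m.step
"""


def load_curves(conn: sqlite3.Connection) -> dict:
    """Return {(run label, metric key): {"steps": [...], "values": [...]}}."""
    curves = defaultdict(lambda: {"steps": [], "values": []})
    for run_short, run_name, key, value, step in conn.execute(HISTORY_SQL):
        label = run_name or run_short  # untagged runs fall back to the short ID
        curves[(label, key)]["steps"].append(step)
        curves[(label, key)]["values"].append(value)
    return dict(curves)


curves = load_curves(conn)
print(curves[("depth24_batch32", "loss")])
```

Each entry's steps and values lists can then be passed as the x and y data of a line plot.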
4. Showing only the latest metrics
SELECT
t.value as run_name,
lm.key,
lm.value as latest_value,
lm.step
FROM latest_metrics lm
JOIN runs r ON lm.run_uuid = r.run_uuid
LEFT JOIN tags t
ON t.run_uuid = r.run_uuid
AND t.key = 'mlflow.runName'
ORDER BY run_name, lm.key;
The latest_metrics table
MLflow maintains a special table:
latest_metrics
It contains only the latest value of each metric.
Instead of storing:
loss step 0
loss step 1
loss step 2
loss step 3
it keeps only:
loss step 3
Example
Metric history:
| step | loss |
|---|---|
| 0 | 4.0 |
| 100 | 3.1 |
| 200 | 2.7 |
| 300 | 2.4 |
latest_metrics stores:
| key | value | step |
|---|---|---|
| loss | 2.4 | 300 |
Why it's useful
Fast run comparison:
there is no need to scan millions of rows of metric history.
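To see what the table saves, here is a sketch of the query that would otherwise be needed to recover the same row from the full history. It assumes "latest" means the point with the highest step for each (run, key) pair and does not try to reproduce whatever tie-breaking MLflow applies; the metrics table is an in-memory stand-in loaded with the example values above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE metrics (run_uuid TEXT, key TEXT, value REAL, step INTEGER);
INSERT INTO metrics VALUES
  ('abc123', 'loss', 4.0, 0),   ('abc123', 'loss', 3.1, 100),
  ('abc123', 'loss', 2.7, 200), ('abc123', 'loss', 2.4, 300);
""")

# Find the highest step per (run, key), then join back to fetch its value.
# On a large history this aggregates over every metric row.
LATEST_FROM_HISTORY_SQL = """
SELECT m.run_uuid, m.key, m.value, m.step
FROM metrics m
JOIN (
    SELECT run_uuid, key, MAX(step) AS max_step
    FROM metrics
    GROUP BY run_uuid, key
) mx
  ON mx.run_uuid = m.run_uuid
 AND mx.key = m.key
 AND mx.max_step = m.step
ORDER BY m.run_uuid, m.key
"""

latest = conn.execute(LATEST_FROM_HISTORY_SQL).fetchall()
print(latest)  # [('abc123', 'loss', 2.4, 300)]
```

Reading latest_metrics directly avoids this aggregation, which is why the query in this section joins against it rather than against metrics.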
MLflow database structure (simplified)
experiments
│
└── runs
    │
    ├── params
    ├── metrics
    ├── latest_metrics
    └── tags
Relationships
Experiment
└── Run
    ├── Parameters
    ├── Metrics History
    ├── Latest Metrics
    └── Tags
For NanoChat training runs:
- experiments = training projects
- runs = individual training attempts
- params = hyperparameters
- metrics = loss/perplexity curves over time
- latest_metrics = final loss/perplexity values
- tags = run names and metadata
A useful next step is usually to start the MLflow UI:
mlflow ui --backend-store-uri sqlite:///mlflow.db
Then open:
http://127.0.0.1:5000
There you can compare NanoChat runs visually, inspect parameters, and view loss curves without querying SQLite by hand.
