Hacker News
Evaluating AI agents: Real-world lessons from building agentic systems at Amazon (amazon.com)
3 points by bpedro 16 days ago
1 comment
lumpilumpi 16 days ago
I get the justification, but I found it hard to understand how the actual evaluation at each step is carried out. For example, is there any calibration against a human gold standard, or is the AI evaluating the AI without calibration or oversight?