Hacker News | jwrae's comments

Hi, I have open-sourced the TensorFlow model in the Sonnet package:

https://github.com/deepmind/sonnet/blob/cd5b5fa48e15e4d020f7...

I'll look into releasing some pre-trained weights, but the model trained on PG-19 is not really intended to be a general-purpose language generation model, so I'd prefer that it not be picked up for downstream applications the way GPT-2 and BERT have been. The text from these old books contains some historical bias, among other issues.

Hopefully the model will be useful for people who want to model long sequences in general, or to build on other compressive-memory ideas.

