ProfileReplies
Or, you know, open positions and pay them…
I don’t know why it took me years to realize that airlines let you just bring lunch on a plane! Why did I go hungry for so long on domestic flights???
(Yes, I promised myself I would try to use more hype-y Gen Z style emojis 👨‍🦳)
💡 The insight? By carefully regularizing the entropy of the policy, we can guarantee consistent policies across independent retraining runs, without access to any information about the other runs! For details, check out @marcelhussing.bsky.social's post and the linked paper.
👀 Check out @marcelhussing.bsky.social and @liv-daliberti.bsky.social's amazing new work on behavioral consistency! ❓The challenge? Retraining an RL agent might give you a completely different policy than before! This makes everything harder, as we never quite know whether we simply got unlucky 🤔
Got a great TMLR paper but missed the RLC deadline? Following last year’s success, @RL_Conference is back with a Journal-to-Conference track! Accepted TMLR papers within scope are invited to submit for consideration. Please submit here: docs.google.com/forms/d/e/1F...