Can you explain how you deal with illegal actions? #6

euwern · 2018-07-23T05:23:30Z

In your paper, you mention that the action scorer module produces two outputs: one action (e.g. "go", "eat") and one object (e.g. "east", "apple"). I wonder how your architecture deals with illegal actions, as in the following example.
Given a state s, the possible actions are:
a1: eat apple
a2: go east
However, the action scorer scores every possible word among the actions ("go", "eat") and the objects ("east", "apple"), which results in 4 possible action-object combinations:
a1: eat apple (legal action) --> score: 0.9
a2: go east (legal action) --> score: 0.08
a3: eat east (illegal action) --> score: 0.01
a4: go apple (illegal action) --> score: 0.01

In such a scenario, how does your architecture deal with illegal actions? Do you just look up a table that contains only the legal actions?
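
To make the question concrete, here is a minimal Python sketch of the lookup-table filtering I have in mind. It is not taken from your code: `legal_actions` and `best_legal_action` are hypothetical names, and the scores are simply the ones from the example above.

```python
from itertools import product

verbs = ["go", "eat"]
objects = ["east", "apple"]

# Scores copied from the example above, one per (action, object) pair.
scores = {
    ("eat", "apple"): 0.9,
    ("go", "east"): 0.08,
    ("eat", "east"): 0.01,
    ("go", "apple"): 0.01,
}

# Hypothetical lookup table of the commands that are legal in state s.
legal_actions = {("eat", "apple"), ("go", "east")}


def best_legal_action(scores, legal_actions):
    """Return the highest-scoring pair among the legal ones only."""
    candidates = {pair: s for pair, s in scores.items() if pair in legal_actions}
    return max(candidates, key=candidates.get)


# The scorer implicitly ranks the full cross product of words.
all_pairs = list(product(verbs, objects))
# Pairs that the scorer can produce but the game would reject.
illegal = [pair for pair in all_pairs if pair not in legal_actions]

best = best_legal_action(scores, legal_actions)
print("illegal pairs:", illegal)
print("chosen action:", " ".join(best))
```

With this kind of filter the illegal pairs never get selected, but it assumes the set of legal actions is known for every state, which is the part I am unsure about in your setup.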
