Documentation

MDP.Counterexample.A

Counterexample exhibiting ⨆ n, ⨅ 𝒮, EC c 𝒮 n < ⨅ 𝒮, ⨆ n, EC c 𝒮 n #

  s⋆-----+-----+·····+·····
  |      |     |     |
s₀,₀   s₀,₁  s₀,₂  s₀,ᵢ
  ⋮      |     |     ⋮
  ∞    s₁,₁  s₁,₂    ⋮
         ⋮     |     ⋮
         ∞   s₂,₂    ⋮
               ⋮   sᵢ,ᵢ
               ∞     ⋮
                     ∞

Setup: (instance)

Now consider the two order of optimization ⨆ n, ⨅ 𝒮, EC c 𝒮 n and ⨅ 𝒮, ⨆ n, EC c 𝒮 n.

In the first case we the scheduler gets to make its choice based on n, and thus can choose a an action where the depth will not reach a state like sᵢ,ᵢ with cost. Thus the expected cost for the ⨆⨅ order will be 0.

In the second case we consider first a scheduler and then a depth. That means that we can pick a depth, say i+1, where the action the scheduler picked in s⋆ was i. In this case we will always be able to pick a depth that reaches a state with cost. Thus the expected cost for the ⨅⨆ order will be .

This leads to iSup_iInf_EC_lt_iInf_iSup_EC.

Additionally we can show the same for MDP.lfp_Φ giving us iSup_iInf_EC_lt_lfp_Φ.

Instances For
    Instances For
      theorem MDP.Counterexample.A.step_iff (a✝ : State) (a✝¹ : ) (a✝² : ENNReal) (a✝³ : State) :
      Step a✝ a✝¹ a✝² a✝³ a✝ = State.init a✝² = 1 a✝³ = State.node 0 a✝¹ ∃ (i : ) (j : ), a✝ = State.node i j a✝¹ = 0 a✝² = 1 a✝³ = State.node (i + 1) j
      @[simp]
      theorem MDP.Counterexample.A.init_iff {α : } {p : ENNReal} {s' : State} :
      Step State.init α p s' p = 1 s' = State.node 0 α
      @[simp]
      theorem MDP.Counterexample.A.node_iff {i j α : } {p : ENNReal} {s' : State} :
      Step (State.node i j) α p s' α = 0 p = 1 s' = State.node (i + 1) j
      @[simp]
      @[simp]
      theorem MDP.Counterexample.A.tsum_p {c : State} {α : } {c' : State} :
      ∑' (p : { p : ENNReal // Step c α p c' }), p = ∑' (p : ENNReal), if Step c α p c' then p else 0
      @[simp]
      theorem MDP.Counterexample.A.𝒮_node {𝒮 : 𝔖[M]} {i j : } :
      𝒮 {State.node i j} = 0
      theorem MDP.Counterexample.A.EC_node_i_le_j_eq_top {𝒮 : 𝔖[M]} {j i n : } (h : j i) :
      EC M.cost 𝒮 n (State.node i j) = if n = 0 then 0 else
      @[simp]
      theorem MDP.Counterexample.A.EC_step {𝒮 : 𝔖[M]} {n i j : } :
      EC M.cost 𝒮 (n + 2) (State.node i j) = EC M.cost 𝒮 (n + 1) (State.node (i + 1) j)
      @[simp]
      theorem MDP.Counterexample.A.EC_node_i_j_n_eq_i_j_add_n {𝒮 : 𝔖[M]} {n i j : } :
      EC M.cost 𝒮 (n + 1) (State.node i j) = EC M.cost 𝒮 1 (State.node (i + n) j)
      @[simp]
      theorem MDP.Counterexample.A.EC_init_eq_EC_node {𝒮 : 𝔖[M]} {n : } :
      EC M.cost 𝒮 (n + 2) State.init = EC M.cost 𝒮 (n + 1) (State.node 0 (𝒮 {State.init}))
      @[simp]
      theorem MDP.Counterexample.A.iInf_iSup_EC_eq_0 :
      ⨅ (𝒮 : 𝔖[M]), ⨆ (n : ), EC M.cost 𝒮 n State.init =
      @[simp]
      theorem MDP.Counterexample.A.iSup_iInf_EC_eq_top :
      ⨆ (n : ), ⨅ (𝒮 : 𝔖[M]), EC M.cost 𝒮 n State.init = 0
      @[simp]
      theorem MDP.Counterexample.A.iInf_iSup_ECℒ_eq_0 :
      ⨅ ( : 𝔏[M]), ⨆ (n : ), EC M.cost (↑) n State.init =
      @[simp]
      theorem MDP.Counterexample.A.iSup_iInf_ECℒ_eq_top :
      ⨆ (n : ), ⨅ ( : 𝔏[M]), EC M.cost (↑) n State.init = 0
      theorem MDP.Counterexample.A.iSup_iInf_EC_lt_iInf_iSup_EC :
      ⨆ (n : ), ⨅ (𝒮 : 𝔖[M]), EC M.cost 𝒮 n State.init < ⨅ (𝒮 : 𝔖[M]), ⨆ (n : ), EC M.cost 𝒮 n State.init
      theorem MDP.Counterexample.A.iSup_iInf_ECℒ_lt_iInf_iSup_ECℒ :
      ⨆ (n : ), ⨅ ( : 𝔏[M]), EC M.cost (↑) n State.init < ⨅ ( : 𝔏[M]), ⨆ (n : ), EC M.cost (↑) n State.init