Hello and welcome to part 7 of our artificial intelligence with StarCraft II and Python programming tutorial series. In this part, and the next few parts, we will be considering the addition of deep learning, but, first, we have to decide what form of deep learning to use!

I believe the most applicable form of machine learning to apply in this case is an evolutionary application of deep learning.

I know all the rage right now is Reinforcement Learning, specifically Q-learning, but I don't see where Q-learning is fundamentally going to work better here than other options, unless we dramatically simplify this challenge, or have hardware that vastly exceeds anything I could reasonably afford, along with an extremely complex network.

The idea of Q-learning is to distribute a reward or penalty across the steps of a given pass through an environment, and a full game of StarCraft II is a very long pass through a very large environment. While we could work within slices of the StarCraft II environment to rectify this issue...why? Other than for clickbait, I just can't justify Q-learning here.

What's the plan with the evolutionary algorithm, then? Evolutionary algorithms are similar to reinforcement learning algorithms, so much so that I would argue they are a form of reinforcement learning.

The main idea of reinforcement learning is to reinforce good choices through an end target result. One of the major pitfalls of reinforcement learning is that people forget they should simply be reinforcing an end result, and nothing else, letting the algorithm figure out how best to get to that end result without human bias.

What else reinforces some end result? Evolution does! What's the end result that is reinforced? Procreation. With an evolutionary algorithm, in the case of StarCraft II, you allow the winning algorithm to become part of the gene pool (training data), and the loser is forgotten.
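To make the gene-pool idea concrete, here's a minimal sketch of the selection loop I have in mind. Everything in it is a hypothetical stand-in (`play_match`, `mutate`, and the population itself), not code from this series:

```python
import random

def evolve(population, play_match, mutate, generations=10):
    """Toy selection loop: winners stay in the gene pool, losers are forgotten.

    population -- list of candidate agents (any representation)
    play_match -- callable(a, b) -> whichever of a or b won the game
    mutate     -- callable(agent) -> a slightly-changed copy of agent
    """
    for _ in range(generations):
        random.shuffle(population)
        # pit candidates against each other in pairs; an odd one out gets a bye
        survivors = [play_match(a, b)
                     for a, b in zip(population[0::2], population[1::2])]
        if len(population) % 2:
            survivors.append(population[-1])
        # refill the pool with mutated copies of the winners;
        # the losers are simply never copied forward
        while len(survivors) < len(population):
            survivors.append(mutate(random.choice(survivors)))
        population = survivors
    return population
```

In our StarCraft II case, "the winner stays" will more likely mean "the winner's game states and choices become training data for the next network," but the selection pressure is the same idea.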
What I propose we do first is simply figure out attacking, or at least try. I do want to highlight that I really do not know the answer here. Maybe Q-learning is the best choice, or maybe something else is, or some other way of structuring things than the one I will use here. This is just going to be my journey through trying things and sharing it with you.

Here's the code we're starting from, at least the parts relevant to attacking (the pylon, assimilator, and expansion logic is unchanged from the earlier parts and omitted):

```python
import random

import sc2
from sc2 import run_game, maps, Race, Difficulty
from sc2.player import Bot, Computer
from sc2.constants import NEXUS, PROBE, PYLON, ASSIMILATOR, GATEWAY, \
    CYBERNETICSCORE, STALKER, STARGATE, VOIDRAY


class SentdeBot(sc2.BotAI):
    def __init__(self):
        self.ITERATIONS_PER_MINUTE = 165
        self.MAX_WORKERS = 50

    async def build_workers(self):
        # aim for roughly 16 probes per nexus, capped at MAX_WORKERS
        if (len(self.units(NEXUS)) * 16) > len(self.units(PROBE)) and len(self.units(PROBE)) < self.MAX_WORKERS:
            for nexus in self.units(NEXUS).ready.noqueue:
                if self.can_afford(PROBE):
                    await self.do(nexus.train(PROBE))

    async def build_offensive_force(self):
        # stalkers from gateways, but never let stalkers outnumber void rays
        for gw in self.units(GATEWAY).ready.noqueue:
            if not self.units(STALKER).amount > self.units(VOIDRAY).amount:
                if self.can_afford(STALKER) and self.supply_left > 0:
                    await self.do(gw.train(STALKER))
        # void rays from stargates
        for sg in self.units(STARGATE).ready.noqueue:
            if self.can_afford(VOIDRAY) and self.supply_left > 0:
                await self.do(sg.train(VOIDRAY))

    def find_target(self, state):
        if len(self.known_enemy_units) > 0:
            return random.choice(self.known_enemy_units)
        elif len(self.known_enemy_structures) > 0:
            return random.choice(self.known_enemy_structures)
        else:
            return self.enemy_start_locations[0]

    async def attack(self):
        # {UNIT: [n to attack with, n to defend with]}
        aggressive_units = {STALKER: [15, 5],
                            VOIDRAY: [8, 3]}

        for UNIT in aggressive_units:
            if self.units(UNIT).amount > aggressive_units[UNIT][0] and self.units(UNIT).amount > aggressive_units[UNIT][1]:
                for s in self.units(UNIT).idle:
                    await self.do(s.attack(self.find_target(self.state)))
            elif self.units(UNIT).amount > aggressive_units[UNIT][1]:
                if len(self.known_enemy_units) > 0:
                    for s in self.units(UNIT).idle:
                        await self.do(s.attack(random.choice(self.known_enemy_units)))
```

Let's instead make *only* Void Ray units, and then we'll modify the attacking protocol. First, let's modify the buildings method:

```python
    async def offensive_force_buildings(self):
        #print(self.iteration / self.ITERATIONS_PER_MINUTE)
        # self.iteration is set every frame in on_step
        if self.units(PYLON).ready.exists:
            pylon = self.units(PYLON).ready.random

            # one gateway is still needed as a prerequisite for the cybernetics core
            if self.units(GATEWAY).ready.exists and not self.units(CYBERNETICSCORE):
                if self.can_afford(CYBERNETICSCORE) and not self.already_pending(CYBERNETICSCORE):
                    await self.build(CYBERNETICSCORE, near=pylon)
            elif len(self.units(GATEWAY)) < 1:
                if self.can_afford(GATEWAY) and not self.already_pending(GATEWAY):
                    await self.build(GATEWAY, near=pylon)

            # scale stargates with how many minutes into the game we are
            if self.units(CYBERNETICSCORE).ready.exists:
                if len(self.units(STARGATE)) < (self.iteration / self.ITERATIONS_PER_MINUTE):
                    if self.can_afford(STARGATE) and not self.already_pending(STARGATE):
                        await self.build(STARGATE, near=pylon)
```

With only Void Rays to make, build_offensive_force drops the stalker branch and keeps just the stargate loop:

```python
    async def build_offensive_force(self):
        for sg in self.units(STARGATE).ready.noqueue:
            if self.can_afford(VOIDRAY) and self.supply_left > 0:
                await self.do(sg.train(VOIDRAY))
```

We can either attack/not attack as a whole, or we could do it on a per-unit basis. I think it would be more interesting to do it on a per-unit basis, but that's more complex, so let's try to keep things simple for now and do it all together to start. Thus, as an army, we either all attack, or not. Finally, our attack method, `async def attack(self)`, becomes quite simple.
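A minimal sketch of what that might look like, assuming the rule is "once we have a handful of Void Rays, every idle one attacks whatever find_target picks" (the threshold of 3 is illustrative, not a tuned number):

```python
    async def attack(self):
        # whole-army rule: either every idle void ray attacks, or nobody does
        if self.units(VOIDRAY).amount > 3:
            for vr in self.units(VOIDRAY).idle:
                await self.do(vr.attack(self.find_target(self.state)))
```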
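For contrast, the per-unit version I called more interesting (and more complex) would give each Void Ray its own decision every step. A rough sketch, where `decide_for` is a hypothetical stand-in for whatever model eventually makes that call:

```python
    async def attack(self):
        # per-unit rule: each idle void ray gets its own decision
        for vr in self.units(VOIDRAY).idle:
            choice = self.decide_for(vr)  # hypothetical per-unit model output
            if choice == "shared_target":
                await self.do(vr.attack(self.find_target(self.state)))
            elif choice == "nearest_enemy" and len(self.known_enemy_units) > 0:
                await self.do(vr.attack(self.known_enemy_units.closest_to(vr)))
            # any other choice: this unit holds for this step
```

The action space blows up fast this way, which is exactly why the all-together version is the saner place to start.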
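For completeness, the bot launches the same way it has all series; the map and difficulty below are just the ones we've been using, not requirements:

```python
run_game(maps.get("AbyssalReefLE"), [
    Bot(Race.Protoss, SentdeBot()),
    Computer(Race.Terran, Difficulty.Hard)
], realtime=False)
```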