Jimminy Crickets : ML-Agents

Updated 6 years ago

MoniGarr

Contractor (android, unity3d, xr, iot) - Owner

How to train Jimminy Crickets to find food in a Unity 3D ML-Agents project.

Training Environment: https://github.com/monigarr/unityconnect_challenge_mlagents
Video of my cricket agent after about 5 minutes of training. The poor thing is jumpy, but it does find the food and the food goes in the belly.
 

I ran the training in the PPO Jupyter notebook (PPO_JimminyCrickets.ipynb in the GitHub repo):
 
Set-up: Unity project with a plane, One Academy, One Brain and One Agent (cricket)
Goal: The cricket needs to find some food (sphere)
Agents: The environment contains one cricket linked to a single brain.
Agent Reward Function:
+1.0 when the agent reaches the goal (eats the food).
-1.0 when the agent fails to reach the goal (does not eat the food).
Brains: One brain with the following state/action space.
 

State space: (Continuous) 2 variables corresponding to movement (left and right).
Action space: (Discrete) Size of 2, corresponding to location number (left and right).
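The set-up above can be sketched as a tiny stand-alone Python environment. This is only a rough illustration, not the code in MonigarrAgent.cs: it assumes the cricket moves on a 1-D line, that the two state variables are the cricket and food positions, and that "does not reach goal" means leaving the plane or running out of steps. The class name ToyCricketEnv and every numeric constant are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class ToyCricketEnv:
    """Hypothetical 1-D stand-in for the Unity scene (not the author's C# agent)."""

    half_width: float = 5.0   # assumed half-length of the plane
    step_size: float = 0.5    # assumed distance moved per action
    eat_radius: float = 0.25  # assumed distance at which the food is eaten
    max_steps: int = 50       # assumed episode length

    def reset(self, seed=None):
        """Put the cricket at the centre and the food on a random side."""
        rng = random.Random(seed)
        self.cricket_x = 0.0
        side = rng.choice([-1.0, 1.0])
        self.food_x = side * rng.uniform(1.0, self.half_width - 1.0)
        self.steps = 0
        return (self.cricket_x, self.food_x)

    def step(self, action):
        """Apply action 0 (left) or 1 (right); return (state, reward, done)."""
        if action not in (0, 1):
            raise ValueError("action must be 0 (left) or 1 (right)")
        self.cricket_x += self.step_size if action == 1 else -self.step_size
        self.steps += 1
        state = (self.cricket_x, self.food_x)
        if abs(self.cricket_x - self.food_x) <= self.eat_radius:
            return state, 1.0, True   # reached the goal: food goes in the belly
        if abs(self.cricket_x) > self.half_width or self.steps >= self.max_steps:
            return state, -1.0, True  # left the plane or ran out of steps
        return state, 0.0, False


def run_episode(env, policy, seed=None):
    """Run one episode and return the final reward."""
    state = env.reset(seed)
    reward, done = 0.0, False
    while not done:
        state, reward, done = env.step(policy(state))
    return reward


def toward_food(state):
    """Hand-written policy: step right if the food is to the right, else left."""
    cricket_x, food_x = state
    return 1 if food_x > cricket_x else 0
```

Calling run_episode(ToyCricketEnv(), toward_food) ends with the +1.0 reward, while a policy that always walks away from the food ends with -1.0. A trained brain has to discover the first behaviour from the reward signal alone.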
As a newbie to Unity3D ML-Agents, I first learned how to train my agents in a 3D environment with a reinforcement learning algorithm, Proximal Policy Optimization (PPO), using a pre-made Jupyter notebook. I edited the notebook to match my own development environment and experimented with all the machine learning examples provided by Unity3D (3DBall, Area, Basic, Crawler, GridWorld, Reacher, Tennis). While learning, I discovered I can also train my agents with neuroevolution and many other machine learning methods through the Python API! I plan to experiment with those in the future after I complete my ML-Agents Challenge project.
1. Create New Learning Environment
The first step was to set up my learning environment by following the instructions at https://github.com/Unity-Technologies/ml-agents
1 Academy 
 
Only one Academy can be put in a Unity Scene. The Academy can hold many different Brains: the Academy is the parent and the Brains are placed as its children.
A scene can also contain many Agents, and each Agent is linked to one Brain. On the right of the screenshot, you can see the options for customizing the Academy characteristics: Max Steps, Wait Time, Frames to Skip, etc.
1 Brain
 
I dragged my Brain object into the Academy; the Brain characteristics are defined in the Inspector (right side of this screenshot). I chose to use the External Brain.
External Brain : action decisions are made with TensorFlow through an open socket to the Python API. I trained with this setting to create a .bytes file.
Internal Brain : action decisions come from a trained TensorFlow model embedded in the project. I later dragged the .bytes file into my Unity project and used it with this setting.
1 Agent
I set up one agent with the MonigarrAgent.cs script attached to my Cricket game object. I made my Agent a Prefab and instantiate it during steps and resets. I can make many agents that all use the one brain I set up, or I can create many different brains for various agents. I set up this project with 1 brain and 1 agent so I can focus on learning about the magic of ML-Agents.
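The Academy, Brain and Agent relationships described above can be summarised as a small data model. This is a sketch for illustration only: the real objects are Unity components, and the class names, the Scene wrapper and the "CricketBrain" label below are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Brain:
    name: str
    brain_type: str = "External"  # "External" to train, "Internal" to run a trained model


@dataclass
class Agent:
    name: str
    brain: Brain  # each agent is linked to exactly one brain


@dataclass
class Academy:
    brains: List[Brain] = field(default_factory=list)  # brains are children of the academy

    def add_brain(self, name, brain_type="External"):
        brain = Brain(name, brain_type)
        self.brains.append(brain)
        return brain


@dataclass
class Scene:
    academy: Academy  # a scene holds at most one academy
    agents: List[Agent] = field(default_factory=list)

    def add_agent(self, name, brain):
        # Only brains that live under this scene's academy may drive an agent.
        if not any(b is brain for b in self.academy.brains):
            raise ValueError("brain is not a child of this scene's academy")
        agent = Agent(name, brain)
        self.agents.append(agent)
        return agent


# This project: one academy, one brain, one agent.
scene = Scene(Academy())
cricket_brain = scene.academy.add_brain("CricketBrain")
cricket = scene.add_agent("Cricket", cricket_brain)
```

Adding a second agent with the same cricket_brain models many agents sharing one brain. Calling add_brain again models a scene with several brains.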
TensorBoard summaries JimminyCrickets.exe
TensorBoard provides charts showing the training progress. I do not fully understand the TensorBoard charts YET, but more reading & experimenting is on my schedule this week after the ml-agents challenge final due date is over.
Training Round One
 
 

step 0
max_steps  50000
run_path  ppo
env_name  JimminyCrickets.exe
curriculum_file  None
gamma  0.99
lambd  0.95
time_horizon  2048
beta  0.001
num_epoch  5
epsilon  0.2
buffer_size  2048
learning_rate  0.0003
hidden_units  64
batch_size  64
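To get a feel for what a few of these numbers imply, here is a small Python sketch. It is not taken from the notebook. It assumes one policy update each time the buffer holds buffer_size experiences, which is plausible with a single agent, and it uses the standard PPO clipped surrogate term to show the role of epsilon. The function names update_schedule and clipped_surrogate are my own.

```python
# A few values copied from the Training Round One listing.
hyperparameters = {
    "max_steps": 50000,
    "num_epoch": 5,
    "epsilon": 0.2,
    "buffer_size": 2048,
    "batch_size": 64,
}


def update_schedule(hp):
    """Derive how often gradient steps happen (assumes one update per full buffer)."""
    minibatches = hp["buffer_size"] // hp["batch_size"]
    return {
        "minibatches_per_epoch": minibatches,
        "gradient_steps_per_update": minibatches * hp["num_epoch"],
        "full_buffers_in_run": hp["max_steps"] // hp["buffer_size"],
    }


def clipped_surrogate(ratio, advantage, epsilon):
    """Standard PPO clipped objective for a single sample.

    ratio is new_policy_prob / old_policy_prob for the action taken.
    """
    clipped_ratio = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
    return min(ratio * advantage, clipped_ratio * advantage)


schedule = update_schedule(hyperparameters)
```

Under these assumptions the run splits each 2048-step buffer into 32 minibatches, takes 160 gradient steps per update, and fills the buffer 24 times in 50000 steps. With epsilon at 0.2, a good action is credited as if its probability ratio were at most 1.2, so one update cannot move the policy too far.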
Training Round Five
 
 
Future Goals for Jimminy Crickets:
I hope to find time to do more interesting experiments with this Unity project that represent a typical cricket's lifecycle. A typical cricket completes its metamorphosis (egg, nymph and adult) in about two months. Maybe I can make three brains to represent each stage of a cricket's metamorphosis. Details about the cricket lifecycle : https://cricketcare.org/life-cycle/   Perhaps something like the following will be a good start:
BrainEgg : 14 days as egg capsule, then digs out of substrate. A cricket begins its life in an egg. After about 14 days, it will have developed into a nymph. It will break the egg capsule and dig out of the substrate.
BrainNymph : no wings, no ovipositors, prey for larger crickets & insects, molts 8 to 10 times, wings start growing after about a month. Nymphs look like small versions of adult crickets with a few differences. They are not as developed, so initially they do not have wings and females do not have ovipositors. These young crickets often become prey for larger crickets and other insects. In order to grow, a nymph has to shed its hard exoskeleton. This process is called molting and happens 8 to 10 times. The new exoskeleton is milky white and soft until it hardens in a few hours. A nymph will begin growing its wings after about a month.
BrainAdultMale : full grown wings to fly, eats nymphs, attracts fertile mates. Once a cricket reaches maturity its wings are fully developed and it only has two goals: eating and mating. A male will attempt to attract fertile females.
BrainAdultFemaleFertile : full grown wings to fly, eats nymphs, attracts adult males to mate, lays eggs in suitable spots. She shares the adult goals of eating and mating described for BrainAdultMale. Once mating has occurred, a female will spend her time finding suitable places to lay her eggs.
Cricket Lifecycle : Metamorphosis
Reward Function
Based on what I'm reading about ml-agents, maybe my future reward functions should increment the reward instead of setting it: the reward is reset back to 0 at each step, so incrementing it means no info is lost when using skipFrame. Details are explained in the Unity ML-Agents GitHub repository.
Positive Rewards:
Reach Food Source : nymphs, grass, water
Eat Food Source : nymphs, grass, water
Reach Potential Mate & Mate : adult cricket of opposite gender & fertile
Lay Eggs in soil
Stay Alive : does not get eaten or killed
Finish Game Level Alive as egg, nymph or adult
Negative Rewards:
Damage : not eating
Damage : reach potential mate that is not fertile
Damage : reach adult cricket that is the same gender
Death : eaten by another cricket
Death : eaten by happy bird
Death : do not finish game level
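A rough Python sketch of how these reward ideas could fit together. The events come from the two lists above, but every reward magnitude is a placeholder I made up, and the event keys and function names are hypothetical. The sketch only shows why incrementing the reward matters when several frames are skipped between steps.

```python
# Placeholder magnitudes (assumptions, not tuned values).
EVENT_REWARDS = {
    "reach_food": 0.25,
    "eat_food": 1.0,
    "mate": 1.0,
    "lay_eggs": 0.5,
    "finish_level_alive": 1.0,
    "damage_not_eating": -0.25,
    "damage_infertile_mate": -0.25,
    "damage_same_gender": -0.25,
    "death_eaten_by_cricket": -1.0,
    "death_eaten_by_bird": -1.0,
    "death_level_not_finished": -1.0,
}


def step_reward_incremented(frames):
    """Reset the reward to 0 for the step, then add every event from every frame."""
    reward = 0.0
    for events in frames:
        for event in events:
            reward += EVENT_REWARDS[event]
    return reward


def step_reward_assigned(frames):
    """Same loop, but assigning instead of adding: earlier frames get overwritten."""
    reward = 0.0
    for events in frames:
        for event in events:
            reward = EVENT_REWARDS[event]
    return reward


# One step that spans three frames (two of them skipped).
frames = [["reach_food"], ["eat_food"], ["damage_not_eating"]]
incremented = step_reward_incremented(frames)
assigned = step_reward_assigned(frames)
```

With these placeholder values the incremented reward for the step is 1.0, while the assigned version reports only the last event, -0.25, and loses the food that was reached and eaten on the earlier frames.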