It looks like the main use of GANs, when not generating images of things that don’t exist, is to generate synthetic datasets based on real ones, to increase the sample size of training data for some machine learning task, like detecting tomato diseases, or breast cancer.
The papers all confirm that the generated fake data is pretty much indistinguishable from the real stuff.
Although Generative Adversarial Networks are very powerful neural networks that can generate new data similar to the data they were trained on, they are limited in the sense that they can only be trained on single-modal data, i.e. data whose dependent variable consists of only one categorical entry.
If a Generative Adversarial Network is trained on multi-modal data, it leads to mode collapse. Mode collapse refers to a situation in which the generator part of the network produces only a limited variety of samples regardless of the input. This means that when the network is trained on multi-modal data directly, the generator learns to fool the discriminator by generating only that limited variety of data.
(Flow-chart: training of a Generative Adversarial Network on a dataset containing images of cats and dogs.)
The following approaches can be used to tackle mode collapse:
Grouping the classes: One of the primary methods of tackling mode collapse is to group the data according to the different classes present in it. This gives the discriminator the ability to discriminate over whole sub-batches and determine whether a given batch is real or fake.
Anticipating Counter-actions: This method focuses on preventing the discriminator from endlessly “chasing” the generator, by training the generator to maximally fool the discriminator while taking the discriminator’s counter-actions into account. The downsides are increased training time and a complicated gradient calculation.
Learning from Experience: This approach involves training the discriminator on old fake samples that were generated by the generator over a fixed number of iterations (see the sketch after this list).
Multiple Networks: This method involves training a separate generative network for each class, thus covering all the classes of the data. The disadvantages include increased training time and a typical reduction in the quality of the generated data.
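Here’s a minimal, hedged sketch of the “Learning from Experience” replay idea, just the buffer mechanics (all names, sizes, and ratios are my own illustrative choices, not from any particular paper):

```python
import random
from collections import deque

# Buffer of old generator outputs; maxlen is an arbitrary choice here.
replay_buffer = deque(maxlen=1000)

def discriminator_fake_batch(new_fakes, batch_size=32, replay_fraction=0.5):
    """Mix fresh fakes with old fakes from the buffer, so the
    discriminator also remembers what the generator used to produce,
    and the generator can't just cycle between modes to fool it."""
    replay_buffer.extend(new_fakes)
    n_replay = min(int(batch_size * replay_fraction), len(replay_buffer))
    old = random.sample(list(replay_buffer), n_replay)
    fresh = list(new_fakes)[: batch_size - n_replay]
    return fresh + old

# Usage: each training step, feed the discriminator this mixed batch
# as its "fake" examples, alongside the usual real batch.
batch = discriminator_fake_batch([f"fake_{i}" for i in range(32)])
```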
The Ray framework has an Analysis class (and an ExperimentAnalysis class), and I noticed the code was kind of buggy, because it should have handled episode_reward_mean being NaN better: https://github.com/ray-project/ray/issues/9826 (episode_reward_mean is an averaged value, so it appears as NaN (Not a Number) for the first few rollouts). It was fixed a mere 18 days ago, so I can download the nightly release instead.
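In the meantime, one workaround is to filter the NaNs out yourself. A minimal sketch, assuming Ray Tune’s ExperimentAnalysis and a made-up results path (on some Ray versions the constructor wants the experiment_state-*.json file rather than the directory):

```python
from ray.tune import ExperimentAnalysis

# Hypothetical experiment path; may need to point at the
# experiment_state-*.json inside the results directory instead.
analysis = ExperimentAnalysis("/root/ray_results/my_experiment")
df = analysis.dataframe()

# episode_reward_mean is NaN for the first few rollouts, so drop those
# rows before asking for the best trial.
df = df[df["episode_reward_mean"].notna()]
best = df.loc[df["episode_reward_mean"].idxmax()]
print(best["logdir"], best["episode_reward_mean"])
```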
Please have a look at Chapter 4.3 of the DSO paper, in particular Figure 20 (Geometric Noise). Direct approaches suffer a LOT from bad geometric calibrations: geometric distortions of 1.5 pixels already reduce the accuracy by a factor of 10.
Do not use a rolling shutter camera; the geometric distortions from a rolling shutter camera are huge, even at high frame-rates (over 60fps).
Note that the reprojection RMSE reported by most calibration tools is the reprojection RMSE on the “training data”, i.e., overfitted to the images you used for calibration. A low value does not imply that your calibration is good; you may just have used insufficient images.
Try different camera / distortion models; not all lenses can be modelled by all models.
DSO cannot do magic: if you rotate the camera too much without translation, it will fail. Since it is pure visual odometry, it cannot recover by re-localizing, or track through strong rotations by using previously triangulated geometry… everything that leaves the field of view is marginalized immediately.
TensorBoard is TensorFlow’s graph and metrics visualization website, served at localhost:6006:
tensorboard --logdir=.
tensorboard --logdir=/root/ray_results/ for all the experiments
I ran the ARS algorithm with Ray on the robotable environment, and left it running for a day with the UI off. I set it up to run Tune, but the environments take 400MB of RAM each, which is pretty close to the 4GB in this laptop, so I was only running a single experiment.
So the next thing is to get it to start playback from a checkpoint.
(A few days pass; the GitHub issue I had was something basic that I thought I’d checked.)
So now I have a process where it runs 100 iterations, then uses the best checkpoint as the starting policy for the next 100 iterations (sketched below). Now it might just be wishful thinking, but I do actually see a positive trend through the graphs, in ‘wall’ view. There’s also lots of variation of falling over, so I think we might just need to get these hyperparameters tuned. (Probably need to tweak reward weights too. But lol, giving AI access to its own reward function…)
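A minimal sketch of that loop, assuming RLlib’s ARSTrainer (Ray ~1.0 era API) and that the robotable environment is registered under the name “robotable”:

```python
import ray
from ray.rllib.agents.ars import ARSTrainer

ray.init()

checkpoint = None
for block in range(5):                 # 5 blocks of 100 iterations
    trainer = ARSTrainer(env="robotable", config={"num_workers": 1})
    if checkpoint is not None:
        trainer.restore(checkpoint)    # resume from the previous checkpoint
    for _ in range(100):
        result = trainer.train()
    checkpoint = trainer.save()        # returns the checkpoint path
    print(block, result["episode_reward_mean"], checkpoint)
    trainer.stop()
```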
Just a note on that, the AI will definitely just be like, *999999
After training it overnight, with the PBT & ARS, it looks like one policy really beat out the other ones.
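For reference, this is roughly how a PBT scheduler plugs into Tune with ARS; a hedged sketch (the mutation values and population size are illustrative guesses, not the ones I ran with):

```python
from ray import tune
from ray.tune.schedulers import PopulationBasedTraining

pbt = PopulationBasedTraining(
    time_attr="training_iteration",
    metric="episode_reward_mean",
    mode="max",
    perturbation_interval=20,
    hyperparam_mutations={
        # ARS hyperparameters to perturb; the ranges are assumptions
        "sgd_stepsize": [0.005, 0.01, 0.02],
        "noise_stdev": [0.01, 0.02, 0.05],
    },
)

tune.run(
    "ARS",
    scheduler=pbt,
    num_samples=4,                      # population size
    config={"env": "robotable", "num_workers": 1},
)
```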
A flexible, high-performance 3D simulator with configurable agents, multiple sensors, and generic 3D dataset handling (with built-in support for MatterPort3D, Gibson, Replica, and other datasets).
File "/usr/local/lib/python3.6/dist-packages/ray/rllib/agents/ars/ars_tf_policy.py", line 59, in compute_actions
observation = self.observation_filter(observation[None], update=update)
TypeError: list indices must be integers or slices, not NoneType
No idea yet, but some other bug report mentioned something about numpy arrays vs. (good old) Python lists. But anyhow, it would be great if I can get ARS working on Ray/RLlib, because I just get the sense that PPO is too dumb. It’s never managed to get past falling over, despite quite a bit of hyperparameter tweaking.
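That TypeError fits the numpy-vs-list theory: `obs[None]` (adding a batch axis) only works on a numpy array, while on a plain Python list, `None` is an illegal index. A quick demonstration:

```python
import numpy as np

obs = [0.1, 0.2, 0.3]   # a plain Python list, as an env might return
# obs[None]             # TypeError: list indices must be integers
                        # or slices, not NoneType
obs = np.asarray(obs)   # what the observation filter presumably expects
batched = obs[None]     # [None] just adds a batch axis
print(batched.shape)    # (1, 3)
```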
At least ARS has evolved a walking table so far. Once it works in Ray, perhaps we’ll have policy save and load, and I can move on to replaying experiences, or continuing training from a checkpoint, etc.
Huh, great, well I solved my problems, and it’s running something now.
But rollouts are not ending now. OK, it looks like I need to put a time limit into the environment itself, rather than having it as a hyperparameter like in pyBullet’s implementation (see the sketch below).
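A minimal sketch of baking the time limit into the environment, using gym’s standard TimeLimit wrapper (the env name and step count are stand-ins for my robotable setup):

```python
import gym
from gym.wrappers import TimeLimit
from ray.tune.registry import register_env

def make_robotable_env(env_config=None):
    env = gym.make("CartPole-v1")   # stand-in for the robotable env
    # End every rollout after at most 1000 steps, inside the env,
    # instead of passing a limit as a trainer hyperparameter.
    return TimeLimit(env, max_episode_steps=1000)

register_env("robotable", make_robotable_env)
```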
https://en.wikipedia.org/wiki/Hessian_matrix: some sort of crazy math thing I’ve come across while researching NNs. My own opinion is that neurons in the brain are not doing any calculus, and to my knowledge there’s still no proof of back-propagation in the brain, so whenever the math gets too complicated, there’s probably a simpler way to do it.
In mathematics, the Hessian matrix or Hessian is a square matrix of second-order partial derivatives of a scalar-valued function, or scalar field. It describes the local curvature of a function of many variables. The Hessian matrix was developed in the 19th century by the German mathematician Ludwig Otto Hesse and later named after him. Hesse originally used the term “functional determinants”.
A function f of two independent variables x and y has two first-order partial derivatives, f_x and f_y. Each of these first-order partial derivatives has two partial derivatives of its own, giving a total of four second-order partial derivatives: f_xx, f_xy, f_yx, and f_yy.
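Collected into a matrix, those four second-order partials are exactly the Hessian. A worked example, for f(x, y) = x²y:

```latex
H(f) =
\begin{pmatrix}
  f_{xx} & f_{xy} \\
  f_{yx} & f_{yy}
\end{pmatrix},
\qquad
f(x, y) = x^2 y
\;\Rightarrow\;
H(f) =
\begin{pmatrix}
  2y & 2x \\
  2x & 0
\end{pmatrix}.
```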