This is a notably relevant paper from 2019, that appears to be keeping track of eggs
“Our custom SSD object detection and classification model classified when chickens and eggs were detected by the video camera. Our models can label video frames with classifications for 8 breeds of chickens and 4 colors of eggs, with 98% accuracy on chickens or eggs alone and 82.5% accuracy while detecting both types of objects.”
“Tuned accuracy is needed for proper thresholding of object detection”
This https://github.com/facebookresearch/meshrcnn is maybe getting closer to holy grail in my mind. I like the idea of bridging the gap between simulation and reality in the other direction too. By converting the world into object meshes. Real2Sim.
The OpenAI Rubik’s cube hand policy transfer was done with camera in simulation and camera in real world. This could allow a sort of dreaming, i.e., running simulations on new 3d obj data.)
It could acquire data that it could mull over, when chickens are asleep.
There’s no chicken category in Pix3d. But getting closer. Just need a chicken and egg dataset.
Downloading blender again, to check out the obj file that was generated. Ok Blender doesn’t want to show it, but here’s a handy site https://3dviewer.net/ to view OBJ files. The issue in blender required selecting the obj, then View > Frame Selected to make it zoom in. Switching to orthographic from perspective view also helps.
pip install pyyaml==5.1
Successfully built pyyaml
Installing collected packages: pyyaml
Attempting uninstall: pyyaml
Found existing installation: PyYAML 3.12
ERROR: Cannot uninstall 'PyYAML'. It is a distutils installed project and thus we cannot accurately determine which files belong to it which would lead to only a partial uninstall.
pip3 install --ignore-installed PyYAML
Successfully installed PyYAML-5.1
Next error...
ModuleNotFoundError: No module named 'torchvision'
pip install torchvision
Next error...
AssertionError:
Found no NVIDIA driver on your system. Please check that you
have an NVIDIA GPU and installed a driver from
http://www.nvidia.com/Download/index.aspx
ok
python3 demo.py --config-file ../configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml --webcam --opts MODEL.WEIGHTS detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl MODEL.DEVICE cpu
[08/17 20:53:11 detectron2]: Arguments: Namespace(confidence_threshold=0.5, config_file='../configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml', input=None, opts=['MODEL.WEIGHTS', 'detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl', 'MODEL.DEVICE', 'cpu'], output=None, video_input=None, webcam=True)
[08/17 20:53:12 fvcore.common.checkpoint]: Loading checkpoint from detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl
[08/17 20:53:12 fvcore.common.file_io]: Downloading https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl …
[08/17 20:53:12 fvcore.common.download]: Downloading from https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl …
model_final_f10217.pkl: 178MB [01:26, 2.05MB/s]
[08/17 20:54:39 fvcore.common.download]: Successfully downloaded /root/.torch/fvcore_cache/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl. 177841981 bytes.
[08/17 20:54:39 fvcore.common.file_io]: URL https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl cached in /root/.torch/fvcore_cache/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl
[08/17 20:54:39 fvcore.common.checkpoint]: Reading a file from 'Detectron2 Model Zoo'
0it [00:00, ?it/s]/opt/detectron2/detectron2/layers/wrappers.py:226: UserWarning: This overload of nonzero is deprecated:
nonzero()
Consider using one of the following signatures instead:
nonzero(*, bool as_tuple) (Triggered internally at /pytorch/torch/csrc/utils/python_arg_parser.cpp:766.)
return x.nonzero().unbind(1)
0it [00:06, ?it/s]
Traceback (most recent call last):
File "demo.py", line 118, in
cv2.namedWindow(WINDOW_NAME, cv2.WINDOW_NORMAL)
cv2.error: OpenCV(4.3.0) /io/opencv/modules/highgui/src/window.cpp:634: error: (-2:Unspecified error) The function is not implemented. Rebuild the library with Windows, GTK+ 2.x or Cocoa support. If you are on Ubuntu or Debian, install libgtk2.0-dev and pkg-config, then re-run cmake or configure script in function 'cvNamedWindow'
Ok...
pip install opencv-python
Requirement already satisfied: opencv-python in /usr/local/lib/python3.6/dist-packages (4.2.0.34)
Looks like 4.3.0 vs 4.2.0.34 kinda thing
sudo apt-get install libopencv-*
nope...
/opt/detectron2/detectron2/layers/wrappers.py:226: UserWarning: This overload of nonzero is deprecated:
nonzero()
Consider using one of the following signatures instead:
nonzero(*, bool as_tuple) (Triggered internally at /pytorch/torch/csrc/utils/python_arg_parser.cpp:766.)
return x.nonzero().unbind(1)
def nonzero_tuple(x):
"""
A 'as_tuple=True' version of torch.nonzero to support torchscript.
because of https://github.com/pytorch/pytorch/issues/38718
"""
if x.dim() == 0:
return x.unsqueeze(0).nonzero().unbind(1)
return x.nonzero(as_tuple=True).unbind(1)
AttributeError: 'tuple' object has no attribute 'unbind'
https://github.com/pytorch/pytorch/issues/38718
FFS. Why does nothing ever fucking work ?
pytorch 1.6:
"putting 1.6.0 milestone for now; this isn't the worst, but it's a pretty bad user experience."
Yeah no shit.
let's try...
return x.nonzero(as_tuple=False).unbind(1)
Ok next error same
/opt/detectron2/detectron2/modeling/roi_heads/fast_rcnn.py:111
Ok... back to this error (after adding as_tuple=False twice)
File "demo.py", line 118, in
cv2.namedWindow(WINDOW_NAME, cv2.WINDOW_NORMAL)
cv2.error: OpenCV(4.3.0) /io/opencv/modules/highgui/src/window.cpp:634: error: (-2:Unspecified error) The function is not implemented. Rebuild the library with Windows, GTK+ 2.x or Cocoa support. If you are on Ubuntu or Debian, install libgtk2.0-dev and pkg-config, then re-run cmake or configure script in function 'cvNamedWindow'
Decided to check if maybe this is a conda vs pip thing. Like maybe I just need to install the conda version instead?
But it looks like a GTK+ 2.x isn’t installed. Seems I installed it using pip, i.e. pip install opencv-contrib-python and that isn’t built with gtk+2.x. I can also use qt as the graphical interface.
“GTK supposedly uses more memory because GTK provides more functionality. Qt does less and uses less memory. If that is your logic, then you should also look at Aura and the many other user interface libraries providing less functionality.” (link )
FB really likes detecting things. I went with their PyTorch version. The matterport version didn’t work out of the box, so went with FB’s code to try image segmentation.
Segmenting an image by the watershed transformation is therefore a two-step process:
* Finding the markers and the segmentation criterion (the criterion or function which will be used to split the regions – it is most often the contrast or gradient, but not necessarily).
* Performing a marker-controlled watershed with these two elements.
Conclusion The explosion in computing power used for deep learning models has ended the “AI winter” and set new benchmarks for computer performance on a wide range of tasks. However, deep learning’s prodigious appetite for computing power imposes a limit on how far it can improve performance in its current form, particularly in an era when improvements in hardware performance are slowing. This article shows that the computational limits of deep learning will soon be constraining for a range of applications, making the achievement of important benchmark milestones impossible if current trajectories hold. Finally, we have discussed the likely impact of these computational limits: forcing Deep Learning towards less computationally-intensive methods of improvement, and pushing machine learning towards techniques that are more computationally-efficient than deep learning.
Yeah, well, the neocortex has like 7 “hidden” layers, with sparse distributions, with voting / normalising layers. Just a 3d graph of neurons, doing some wiggly things.
I was wondering what this ‘grad_norm’ parameter is. I got sidetracked by the paper with this name, which is maybe even unrelated. I haven’t gone to the Tune website yet to check if it’s the same thing. The grad_norm I was looking for is in the ARS code.
So, it’s the sum of the square of g… and g is… the total of the ‘batched weighted sum’ of the ARS perturbation results, I think… but it’s not theta… which is the policy itself. Right, so the policy is the current agent/model brains, and g would be the average of the rollout tests, and so grad_norm, as a sum of squares is like the square of variance, so it’s a sort of measure of how much weights are changing, i think.
# Compute and take a step.
g, count = utils.batched_weighted_sum(
noisy_returns[:, 0] - noisy_returns[:, 1],
(self.noise.get(index, self.policy.num_params)
for index in noise_idx),
batch_size=min(500, noisy_returns[:, 0].size))
g /= noise_idx.size
# scale the returns by their standard deviation
if not np.isclose(np.std(noisy_returns), 0.0):
g /= np.std(noisy_returns)
assert (g.shape == (self.policy.num_params, )
and g.dtype == np.float32)
# Compute the new weights theta.
theta, update_ratio = self.optimizer.update(-g)
# Set the new weights in the local copy of the policy.
self.policy.set_flat_weights(theta)
# update the reward list
if len(all_eval_returns) > 0:
self.reward_list.append(eval_returns.mean())
def batched_weighted_sum(weights, vecs, batch_size):
total = 0
num_items_summed = 0
for batch_weights, batch_vecs in zip(
itergroups(weights, batch_size), itergroups(vecs, batch_size)):
assert len(batch_weights) == len(batch_vecs) <= batch_size
total += np.dot(
np.asarray(batch_weights, dtype=np.float32),
np.asarray(batch_vecs, dtype=np.float32))
num_items_summed += len(batch_weights)
return total, num_items_summed
Well anyway. I found another GradNorm.
GradNorm
https://arxiv.org/pdf/1711.02257.pdf – “We present a gradient normalization (GradNorm) algorithm that automatically balances training in deep multitask models by dynamically tuning gradient magnitudes.”
Conclusions We introduced GradNorm, an efficient algorithm for tuning loss weights in a multi-task learning setting based on balancing the training rates of different tasks. We demonstrated on both synthetic and real datasets that GradNorm improves multitask test-time performance in a variety of scenarios, and can accommodate various levels of asymmetry amongst the different tasks through the hyperparameter α. Our empirical results indicate that GradNorm offers superior performance over state-of-the-art multitask adaptive weighting methods and can match or surpass the performance of exhaustive grid search while being significantly less time-intensive.
Looking ahead, algorithms such as GradNorm may have applications beyond multitask learning. We hope to extend the GradNorm approach to work with class-balancing and sequence-to-sequence models, all situations where problems with conflicting gradient signals can degrade model performance. We thus believe that our work not only provides a robust new algorithm for multitask learning, but also reinforces the powerful idea that gradient tuning is fundamental for training large, effective models on complex tasks.
…
The paper derived the formulation of the multitask loss based on the maximization of the Gaussian likelihood with homoscedastic* uncertainty. I will not go to the details here, but the simplified forms are strikingly simple.
In statistics, a sequence (or a vector) of random variables is homoscedastic /ˌhoʊmoʊskəˈdæstɪk/ if all its random variables have the same finite variance. This is also known as homogeneity of variance. The complementary notion is called heteroscedasticity.
Tensorboard is TensorFlow’s graphs website at localhost:6006
tensorboard –logdir=.
tensorboard –logdir=/root/ray_results/ for all the experiments
I ran the ARS algorithm with Ray, on the robotable environment, and left it running for a day with the UI off. I set it up to run Tune, but the environments are 400MB of RAM each, so it’s pretty close to the 4GB in this laptop, so I was only running a single experiment.
So the next thing is to get it to start play back from a checkpoint.
(A few days pass, the github issue I had was something basic, that I thought I’d checked.)
So now I have a process where it’s running 100 iterations, then uses the best checkpoint as the starting policy for the next 100 iterations. Now it might just be wishful thinking, but i do actually see a positive trend through the graphs, in ‘wall’ view. There’s also lots of variation of falling over, so I think we might just need to get these hyperparameters tuning. (Probably need to tweak reward weights too. But lol, giving AI access to its own reward function… )
Just a note on that, the AI will definitely just be like, *999999
After training it overnight, with the PBT & ARS, it looks like one policy really beat out the other ones.
I’m sticking to the google protobuf code that the minitaur uses, and will just need to save the best episodes, and work out how to replay them. The comments ask “use recordio?”