A modern data augmentation library



Recent advances in deep learning models have been closely linked to the amount and diversity of data collected. Data augmentation is a technique that artificially creates new training data from existing training data, allowing practitioners to dramatically increase the variety of data available for training models without actually collecting new data. This is done by applying specific transformations to the training data to create new and different training examples. The most commonly used data augmentation techniques for training large neural networks are cropping, padding, and horizontal flipping, and most training pipelines use only these basic types of augmentation. Although neural network architectures have been studied extensively, robust types of data augmentation and augmentation policies that capture data invariances remain an open area of research.

Even with a suitable model, training may not achieve satisfactory results; in the end, it comes down to the data used to train the network. Having a large dataset is crucial for the performance of a deep learning model, and a lack of quantity and diversity in the data adversely affects it. Data augmentation helps us increase the size of the dataset and introduce variability: the neural network treats each augmented sample as a separate data point. Data augmentation also helps reduce the problem of overfitting.


Our dataset may contain images taken under a limited set of conditions, and the model may fail under conditions it has not seen; modified/augmented data helps cope with such scenarios. Especially for image data, we only need to make minor changes to our existing training data to get more data. The changes can include flipping the image horizontally or vertically, padding, cropping, rotating, scaling, and a few other transformations of the input image. For machine learning and deep learning models, collecting and labeling data can be exhausting and expensive, so transforming existing datasets with augmentation techniques lets organizations reduce these operational costs. Companies also need to put systems in place to assess the quality of augmented datasets: as the use of data augmentation methods increases, evaluating the quality of their output becomes essential.
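The basic geometric transformations listed above are simple array operations. A minimal sketch in NumPy (using a toy 4x4 array as a stand-in for an image; real pipelines would operate on decoded image tensors):

```python
import numpy as np

# A toy 4x4 single-channel "image"
img = np.arange(16, dtype=np.uint8).reshape(4, 4)

# Horizontal flip: reverse the column axis
flipped_h = img[:, ::-1]

# Vertical flip: reverse the row axis
flipped_v = img[::-1, :]

# Padding: add a 1-pixel black border on every side
padded = np.pad(img, pad_width=1, mode="constant", constant_values=0)

# Cropping: keep the central 2x2 region
cropped = img[1:3, 1:3]

print(flipped_h.shape, padded.shape, cropped.shape)  # (4, 4) (6, 6) (2, 2)
```

Each transformed array is a new training example that a network would treat as a distinct data point.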

A more ambitious data augmentation technique is to take advantage of segmentation annotations, obtained manually or from an automatic segmentation system, and create new images with objects placed at different positions in existing scenes.


What is AugLy?

AugLy is a new open-source Python library that helps AI researchers use and create data augmentations to improve the robustness of their machine learning models. Augmentations can include a wide variety of edits to a dataset, from cropping a photo to changing the pitch of a voice recording. In addition, AugLy provides sophisticated data augmentation tools to create samples for training and testing different systems. AugLy currently supports four modalities (audio, image, text, and video) and over 100 augmentations. The augmentations for each modality are contained in their own sub-library. These sub-libraries offer both function-based and class-based transforms and composition operators, and provide an option to return metadata about the applied transformation.
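The function-based / class-based duality described above is a common pattern in augmentation libraries. A generic sketch of the idea in plain Python (illustrative only; the `brighten`/`Brighten` names are hypothetical, not AugLy's actual API):

```python
# Function-based transform: apply directly, optionally record metadata
def brighten(value, factor=1.0, metadata=None):
    result = value * factor
    if metadata is not None:
        metadata.append({"name": "brighten", "factor": factor})
    return result

# Class-based transform: configure once, apply many times
class Brighten:
    def __init__(self, factor=1.0):
        self.factor = factor

    def __call__(self, value, metadata=None):
        return brighten(value, factor=self.factor, metadata=metadata)

meta = []
aug = Brighten(factor=2.0)
print(aug(10, metadata=meta))  # 20.0
print(meta)                    # [{'name': 'brighten', 'factor': 2.0}]
```

The metadata list makes each applied transformation auditable, which is what AugLy's `metadata` option provides for its real augmentations.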

AugLy is a great library for augmenting data when training models, with transformations such as overlaying emojis on pictures and videos or simulating a screenshot being reposted on social media. It has four sub-libraries, each corresponding to a different modality and following the same interface. AugLy can also generate useful metadata to help you understand how your data has been transformed.


Getting started with the code

This article will explore the different augmentation utilities available in the AugLy library developed by Facebook's AI research team. We will perform augmentations on image and video data using components from the library. The following implementation is partially inspired by the examples from AugLy's creators, which can be found in the project's GitHub repository.

Image data augmentation: library installation

We will first perform an image augmentation using Augly. To do this, we will first need to install the library required for image-based augmentation. This can be done using the following code,

!pip install -U augly
!sudo apt-get install python3-magic
Importing dependencies

Now that our library is installed, we will import other dependencies,


import os
import augly.image as imaugs
import augly.utils as utils
from IPython.display import display
Adjusting the Image Path and Size

We will now set the image path and scale the image down so that, once displayed, it does not take up the whole screen and we can clearly see the changes.
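Under the hood, scaling resamples the pixel grid. A toy nearest-neighbour sketch of the idea in NumPy (a sketch of the general technique, not AugLy's actual implementation):

```python
import numpy as np

def scale(img, factor):
    """Nearest-neighbour rescale of a 2-D array by `factor`
    (a toy stand-in for what an image scale augmentation does)."""
    h, w = img.shape
    new_h, new_w = max(1, int(h * factor)), max(1, int(w * factor))
    # Map each output pixel back to its nearest source pixel
    rows = (np.arange(new_h) / factor).astype(int).clip(0, h - 1)
    cols = (np.arange(new_w) / factor).astype(int).clip(0, w - 1)
    return img[np.ix_(rows, cols)]

img = np.arange(100, dtype=np.uint8).reshape(10, 10)
print(scale(img, 0.5).shape)  # (5, 5)
```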

# Setting image path
input_img_path = "/content/856136_new-black-and-white-2016-drake-4k-wallpapers_1200x675_h.png"

# We can use the AugLy scale augmentation to scale the image down
input_img = imaugs.scale(input_img_path, factor=0.5)
display(input_img)

Output:

Performing augmentations

Let’s perform some augmentations and explore what the library has to offer,

# applying various augmentations to the scaled image
#creating a meme
display(
    imaugs.meme_format(
        input_img,
        caption_height=100,
        meme_bg_color=(20, 19, 21),
        text_color=(255, 255, 255),
    )
)
# shuffling image pixels
meta = []
display(imaugs.shuffle_pixels(input_img, factor=0.3, metadata=meta))
meta
[{'dst_height': 420,
  'dst_width': 560,
  'factor': 0.3,
  'intensity': 30.0,
  'name': 'shuffle_pixels',
  'output_path': None,
  'seed': 10,
  'src_height': 420,
  'src_width': 560}]
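Note how the metadata records the augmentation name, the `factor`, and an `intensity` of 30.0, which corresponds to `factor=0.3` on a 0 to 100 scale. The core idea of pixel shuffling can be sketched in NumPy as permuting a `factor` fraction of the pixels (an assumed, simplified version of the technique; AugLy's exact algorithm may differ):

```python
import numpy as np

def shuffle_pixels(img, factor=0.3, seed=10):
    """Randomly permute a `factor` fraction of the pixels (toy sketch)."""
    rng = np.random.default_rng(seed)
    flat = img.flatten()  # flatten() returns a copy
    n = int(flat.size * factor)
    # Pick n distinct pixel positions and permute their values among themselves
    idx = rng.choice(flat.size, size=n, replace=False)
    flat[idx] = flat[rng.permutation(idx)]
    return flat.reshape(img.shape)

img = np.arange(16, dtype=np.uint8).reshape(4, 4)
out = shuffle_pixels(img, factor=0.5)
print(out.shape)  # (4, 4)
```

The output keeps the same shape and the same multiset of pixel values; only their positions change, which degrades local structure while preserving global statistics.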


# Applying a perspective transform
meta = []
aug = imaugs.PerspectiveTransform(sigma=20.0)
display(aug(input_img, metadata=meta))
meta
# Changing the aspect ratio
aug = imaugs.RandomAspectRatio()
display(aug(input_img))
# Applying several transformations together to create a new image
aug = imaugs.Compose(
    [
        imaugs.Saturation(factor=0.7),
        imaugs.OverlayOntoScreenshot(
            template_filepath=os.path.join(
                utils.SCREENSHOT_TEMPLATES_DIR, "mobile.png"
            ),
        ),
        imaugs.Scale(factor=0.9),
    ]
)
display(aug(input_img))
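The `Compose` operator above simply chains class-based transforms, feeding each output into the next. A minimal sketch of the pattern in plain Python (illustrative only; AugLy ships its own `Compose`):

```python
class Compose:
    """Apply a list of callable transforms in order."""
    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, x):
        for transform in self.transforms:
            x = transform(x)
        return x

# Toy transforms on a number standing in for an image
double = lambda x: x * 2
add_one = lambda x: x + 1

aug = Compose([double, add_one])
print(aug(10))  # 21
```

Because composition applies the transforms in order, `Compose([double, add_one])` yields 21 for input 10, while the reverse order would yield 22.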
Video data augmentation

Just like images, similar augmentations can also be performed on video data, and we will follow the same process. First, install the video augmentation library and its dependencies; then we will use different sets of code to perform augmentations on the input video.

# Installing the AugLy video library
 
!pip install -U augly[av]
!sudo apt-get install python3-magic

# Installing ffmpeg for the video module of augly
!sudo add-apt-repository ppa:jonathonf/ffmpeg-4
!apt install ffmpeg

# Helper to display a video inline in the notebook
 
from IPython.display import display, HTML
from base64 import b64encode
 
def display_video(path):
  mp4 = open(path, 'rb').read()
  data_url = "data:video/mp4;base64," + b64encode(mp4).decode()
  display(
    HTML(
      """
      <video width=400 controls>
        <source src="%s" type="video/mp4">
      </video>
      """ % data_url
    )
  )
 
import os
import augly.utils as utils
import augly.video as vidaugs
 
# Set video path and trim to first 3 seconds
input_video = os.path.join(
    utils.TEST_URI, "video", "inputs", "input_1.mp4"
)
input_vid_path = "/tmp/in_video.mp4"
out_vid_path = "/tmp/aug_video.mp4"
 
# We can use the AugLy trim augmentation, and save the trimmed video
vidaugs.trim(input_video, output_path=input_vid_path, start=0, end=3)
display_video(input_vid_path)
Performing different video augmentations

Now let’s do different types of augmentations on the input video,

# Overlaying the video with random text
vidaugs.overlay_text(input_vid_path, out_vid_path)
display_video(out_vid_path)
#code for looping video repeatedly
meta = []
vidaugs.loop(
    input_vid_path,
    out_vid_path,
    num_loops=1,
    metadata=meta,
)
display_video(out_vid_path)
meta

#displaying randomly generated emoji in video 
meta = []
aug = vidaugs.RandomEmojiOverlay()
aug(input_vid_path, out_vid_path, metadata=meta)
display_video(out_vid_path)
meta
# Degrading video quality with noise, blur, and dot overlays
aug = vidaugs.Compose(
    [
        vidaugs.AddNoise(),
        vidaugs.Blur(sigma=5.0),
        vidaugs.OverlayDots(),
    ]
)
aug(input_vid_path, out_vid_path)
display_video(out_vid_path)

End Notes

In this article, we have tried to understand what data augmentation is and how it works. We also delved into its uses and explored the AugLy library, which allows us to easily perform image, text, audio, and video augmentation. The above implementations can be found in Colab notebook form.



