GitLab CI/CD - From Zero to Hero - TechWorld with Nana Course Notes

 

How Gitlab CI/CD compares to Jenkins

Jenkins:
- Still an industry leader
- Powerful, open source, big community
- Flexible integrations through thousands of community plugins
- An older tool that was not designed for the container age
- Requires installing, configuring, and maintaining additional plugins
- CI/CD tool only
- Self-hosting is the only option
GitLab CI/CD:
- GitLab is a fully featured DevOps platform
- Keeps up with industry developments
- Many features built in: self-monitoring, Container Registry, Docker CI Runner, etc.
- Keeps CI/CD & code management in the same place
- All-in-one solution
- Self-hosted or SaaS (managed)
- Lets you use CI/CD without the overhead of setting it up yourself

GitLab CI/CD vs Azure Pipelines:

Azure Pipelines:
- Best integration with Microsoft services; other integrations are not as easy
- Best choice if you are already using the Azure platform
- Commercial
GitLab CI/CD:
- Best integration with other GitLab services, but other integrations are easy as well
- Open source and commercial
➡️ Jobs: we can define arbitrary names for our jobs.
➡️ A job must contain at least the script clause.
➡️ The script clause specifies the commands to execute.
➡️ "before_script" - commands that run before the script commands
➡️ "after_script" - commands that run after each job, including failed jobs
➡️ On every commit, GitLab triggers the pipeline automatically.
➡️ No webhooks needed, because the Git repository and CI/CD functionality live on the same platform.
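The keywords above can be sketched in a minimal .gitlab-ci.yml (the job name and echo messages are illustrative):

```yaml
run_tests:                    # arbitrary job name
  before_script:
    - echo "Preparing test environment..."
  script:                     # required: the commands the job executes
    - echo "Running tests..."
  after_script:               # runs after the job, even if it failed
    - echo "Cleaning up..."
```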

You can group multiple jobs into stages that run in a defined order. Multiple jobs in the same stage are executed in parallel.

With "stages" we can logically group jobs that belong together. Only when all jobs in a stage succeed is the next stage executed.
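A sketch of this stage ordering (job names are illustrative):

```yaml
stages:            # stages run in this order
  - test
  - build

run_unit_tests:    # both test jobs run in parallel
  stage: test
  script:
    - echo "Running unit tests..."

run_lint_checks:
  stage: test
  script:
    - echo "Running lint checks..."

build_image:       # runs only after all jobs in the test stage succeed
  stage: build
  script:
    - echo "Building image..."
```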

"only"/"except" - job keywords to control when jobs are executed.
- only -> defines when a job runs
- except -> defines when a job does not run
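A sketch of both keywords (job and branch names are illustrative):

```yaml
deploy_prod:
  stage: deploy
  script:
    - echo "Deploying..."
  only:
    - main          # run this job only for commits on main

run_extra_checks:
  stage: test
  script:
    - echo "Extra checks..."
  except:
    - main          # run this job on every branch except main
```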

- Image versioning: different companies use different methods, like semantic versioning, date of release, or unary numbering.

- By default, a VARIABLE is only available in the job it was defined in (the job environment).
- That is because every job gets executed in its own new environment (Docker container).
- Artifacts are used to pass intermediate results between jobs.
- We can use the artifacts attribute to create job artifacts.
- The artifacts are sent to GitLab after the job finishes and are available for download in the GitLab UI. So if we want to pass variables between different jobs in the pipeline, we can echo the variables into a file and save the file as an artifact; the file is then automatically available in other jobs, where we can read the values back from it. To fetch artifacts from a job in the same stage, we need the 'dependencies' attribute. We can also use a dotenv (.env) report to pass the variables; this is actually the recommended way, because the variables can then be used directly in other jobs without defining them again (which we need to do if we use plain file artifacts to pass variables).
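A sketch of passing a value between jobs through a file artifact (job, file, and variable names are illustrative):

```yaml
build_image:
  stage: build
  script:
    - export VERSION=1.2.3
    - echo $VERSION > version.txt    # write the value into a file
  artifacts:
    paths:
      - version.txt                  # upload the file as a job artifact

push_image:
  stage: deploy
  script:
    - export VERSION=$(cat version.txt)   # read the value back in a later job
    - echo "Pushing image $VERSION..."
```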

- By default, jobs in later stages automatically download all artifacts created by jobs in earlier stages (but not from the same stage).
- To download artifacts from a job in the same stage, use the "needs" or "dependencies" attribute.
- Use the "dependencies" attribute to define a list of jobs to fetch artifacts from.
- The "needs" attribute tells GitLab that the current job (push_image) must wait until the listed job (build_image) has completed.
    
    Example:
        push_image:
          needs:
            - build_image
- Difference between "needs" and "dependencies": "needs" says the 'build_image' job is needed by the 'push_image' job, meaning 'build_image' must complete successfully before 'push_image' starts. If the job listed under "needs" (build_image) fails, the current job (push_image) will not execute. The main purpose of 'dependencies' is to tell GitLab that the current job (push_image, where it is mentioned) needs an artifact from the listed job (build_image).
   Example:
        push_image:
          needs:
            - build_image
          dependencies:
            - build_image

   Another example:
        test_dev:
          stage: deploy
          script:
            - echo "testing dev deployment..."
          #dependencies: []      # do not download any artifacts
          dependencies:
            - run_unit_tests     # only this job's artifacts are downloaded
When we mention both 'needs' and 'dependencies', they impact each other: the job names listed in 'dependencies' should also appear in 'needs'. The following will still work, but for jobs listed only in 'needs' no artifacts are downloaded. Another interesting overlap is that 'needs' already implies 'dependencies': when we use 'needs', the job waits for 'build_image' to complete before starting, and it also downloads the artifacts from that 'build_image' job. So there is no need to define 'dependencies' when a job already has the 'needs' attribute.
    Example:
        push_image:
          needs:
            - build_image
            - someother_job
          dependencies:
            - build_image
- With 'needs', only artifacts from the jobs listed in 'needs' are downloaded.
- Why? Because jobs with 'needs' can start before earlier stages complete.

- To prevent automatic artifact download: if you configure a job with an empty array ('dependencies: []'), it will not download any artifacts. This is helpful when the pipeline produces a lot of artifacts.
- To restrict the artifact download: use the following; it will only download the artifacts from the 'test' stage (i.e., run_unit_tests)
    Example:
        dependencies:
          - run_unit_tests
- What is a dotenv file? A lightweight npm package that automatically loads environment variables from a .env file into the process.
    Dotenv format:
    - one variable definition per line
    - each line must be of the form:
        VARIABLE_NAME=ANY_VALUE
- Save the .env file as a 'dotenv' artifact.
- A dotenv report collects the environment variables as artifacts.
    Example-
        artifacts:
          reports:
            dotenv: build.env
- The collected variables are registered as runtime-created variables of the job (note: these variables cannot be used to configure the pipeline, only in job scripts)
- Jobs in later stages can use the variables in their scripts
- If you DON'T want to inherit a variable that is already defined in an earlier stage/job (because the current job defines the same variable name with a different value), set 'dependencies' to an empty array [] so the value of that variable is not inherited from the other job/stage.
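A sketch of this shadowing case (job and variable names are illustrative); a dotenv variable inherited from an earlier job would otherwise take precedence over the job-level value:

```yaml
deploy_dev:
  stage: deploy
  dependencies: []        # no artifacts downloaded, so no dotenv variables inherited
  variables:
    VERSION: dev-latest   # this job's own value for the shared variable name
  script:
    - echo "Deploying version $VERSION..."
```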
- We can use 'environment' to automatically create an environment with an endpoint, so that a clickable link is created once the pipeline has executed.
    environment:
      name: development
      url: http://end-point-url:3000
- What is the use of Docker Compose? When we want to run multiple containers, we can define all of them in one .yaml file.
- The docker-compose command takes the file name -f docker-compose.yaml by default, so there is no need to mention it explicitly.
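A minimal docker-compose.yaml sketch (service and image names are illustrative):

```yaml
version: "3.8"
services:
  app:
    image: registry.gitlab.com/my-group/frontend:1.0   # illustrative image
    ports:
      - "3000:3000"
  db:
    image: mysql:8.0
    environment:
      MYSQL_ROOT_PASSWORD: example
```

Running `docker-compose up` in the directory containing this file starts both containers on a shared network.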
- Artifacts can be used to share files between jobs. The "cache" feature can be used to reuse dependencies (files) across different pipelines (since the dependencies are downloaded to the server where the runner is running).

Artifacts Vs Cache:

- Job artifacts are uploaded and saved on the GitLab server
- Use artifacts to pass intermediate build results between stages

- Use cache for dependencies, like packages you download from the internet
- Cache is stored on the GitLab runner, so jobs that run on the same runner can reuse the local cache on that server
- If you have many runners configured (like 100 runners), caching may not be so efficient, because the cache needs to be created on each runner's server
- If you have many runners, you can configure a "distributed cache" (i.e., cache stored in an AWS S3 bucket). A downside of this approach is that we again need to download over the internet, but it is still faster to download one zip file instead of each dependency separately. We can define a cache for each job.
- Configure cache (in .gitlab-ci.yml):
    cache:
      key: my-cache
      paths:
        - .config          # where this job expects the cache
      policy: pull-push    # default policy
    Example:
        run_lint_checks:
          stage: test
          image: node:17-alpine3.14
          tags:
              - dock
              - windows
          before_script:
              - cd app
              - npm install
          script:
            - echo "running lint checks"
          cache:
            key: ${CI_COMMIT_REF_NAME}
            paths:
              - app/node_modules
            policy: pull
- The above config serves two purposes:
    1) Generating the cache, when the job runs for the first time
    2) Downloading the cache, on subsequent pipeline runs
    
- Cache policies: "pull-push" (download the cache first, then update it at the end of the job); "pull" (only download the cache), used when many jobs executing in parallel use the same cache.
- Your job should never depend on a cache being available. Caching is an optimization, and it isn't guaranteed to always work.
- If your jobs run on a GitLab runner with the Docker executor, the cache is created inside a Docker container, and when the job completes the runner removes that container. So to persist the data, we have to configure a "Docker volume", which replicates the data from the container onto the host.
    gitlab-runner/config.toml
    [[runners]]
      cache_dir = "/cache"

      [runners.docker]
        volumes = ["/cache"]
- Clear the cache: we can do it in two ways:
    1) Manually, with "Clear runner caches" in the GitLab pipelines UI.
    2) Change the value of the cache key in the yml file.

Note: the old cache is not physically deleted from the disk. Instead, a new cache name is created and used. To completely delete the cache, you can manually delete the files from the runner's storage.

Check "Cache" on the disk:
- docker volume ls                                        # list volumes
- docker volume inspect runner-9fp7zh4n-project-******    # get the volume mountpoint

- What is a CI/CD template? GitLab engineers wrote these job templates. We can browse them at: https://gitlab.com/gitlab-org/gitlab-foss/tree/master/lib/gitlab/ci/templates
- Include the SAST template as follows. Possible include subkeys are: template, local, file, remote. For example, we can split one long .gitlab-ci.yml file into multiple files and "include" them. We can nest up to 100 includes.
    sast:
        stage: test
    
    include:
        - template: Job/SAST.gitlab-ci.yml
- There are other templates as well, like "pipeline templates", which provide an end-to-end CI/CD workflow. A pipeline template is not included in another main pipeline configuration, but rather used by itself.

- The "extends" attribute is used to reuse configuration sections. "extends" supports 11 levels of inheritance, but best practice is to use no more than 3 levels.
- A job inherits configuration from another job (usually a hidden job).
- How to prevent a job from executing (or how to hide a job)? Prefix the job name with a dot (e.g., .deploy). A job whose name starts with a dot is not processed by GitLab CI/CD.
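A sketch combining a hidden job with 'extends' (job and variable names are illustrative):

```yaml
.deploy:                  # hidden job: never executed, only used as a template
  stage: deploy
  script:
    - echo "Deploying to $DEPLOY_ENV..."

deploy_dev:
  extends: .deploy        # inherits stage and script from the hidden job
  variables:
    DEPLOY_ENV: development
```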

- To configure manual approval for a pipeline job, use "when: manual" so that the job doesn't run unless a user starts it manually. This option is commonly used for production deployments.
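A sketch of a manual production job (the job name is illustrative):

```yaml
deploy_prod:
  stage: deploy
  script:
    - echo "Deploying to production..."
  when: manual            # waits until a user triggers it from the pipeline UI
```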
- By default, specific runners are locked to their project. We can edit a runner's configuration and make it available for the projects we wish.

Unlock/Use Runners from other projects in Gitlab:

1. Go to the project where the runner is registered and edit it: uncheck the "lock" checkbox & save (so that we can enable it for other projects)
2. Restart the runner (.\gitlab-runner.exe restart)
3. Now, go to the target project's Settings --> CI/CD --> Runners
    - You should be able to see the runner (that we edited in step 1)
    - Click "Enable for this project"

- As of now, in GitLab some variables cannot be expanded in certain places. One such place is "only":
    only:
      changes:
        - "frontend/**/*"
- The COMPOSE_PROJECT_NAME environment variable is used to change the Docker container name:
        Container name: <project name>_<service name>_<index>
        By default, project name = current folder name (on Unix/Ubuntu)

        Example: export COMPOSE_PROJECT_NAME=${MICRO_SERVICE} (the container will be named like frontend_app_1)
- docker-compose creates a new Docker network on every docker-compose up
- By default, the network name is based on the name of the directory the compose file resides in (example: if the file is in the /src directory, the network name will be "src_default").
- If we use "COMPOSE_PROJECT_NAME", it also changes the prefix of the network name along with the container name (COMPOSE_PROJECT_NAME=frontend gives network name "frontend_default" and container name "frontend_app_1").
- By default, containers in different networks can't communicate with each other (because they are in different networks).
 
- One of the disadvantages of a "polyrepo" (one repository per service) is repeating the same pipeline configuration.
- We can use "job templates" to get rid of code duplication (comparable to a Jenkins shared library).
- There are two types of templates: pipeline templates & job templates. Pipeline templates provide an end-to-end CI/CD workflow and should be used by themselves in projects, with no other .gitlab-ci.yml file. Job templates provide specific jobs and are included in an existing CI/CD workflow.

Important criteria when writing templates:

- They should be generic and reusable
- You should make the templates customizable (i.e., accept custom parameters to make them specific to our project's needs)
-----------
- We can add custom configuration (customize a job) on top of the job code defined in the template. Note that the custom job overwrites the code of the same-named job in the template. So if we want to execute extra code along with the code in the template job, we have to repeat the template job's code in the custom job as well, because the custom job/code overwrites the template job/code. In a nutshell: the custom job/code overwrites the template job/code.
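A sketch of this overriding behavior (file and job names are illustrative); since the custom 'script' replaces the template's, any template commands still needed must be repeated:

```yaml
include:
  - local: ci-templates/build.yml   # assume it defines a 'build' job

build:                              # same job name as in the template
  script:
    - echo "Template build step..." # repeated from the template job
    - echo "Extra custom step..."   # our project-specific addition
```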
- Use "include:file" & "include:project" to include files from another (private) project on the same GitLab instance.
Example:
    include:
      - project: group/project_path/ci-templates
        ref: main         #to specify specific branch/commit
        file:
          - build.yml
          - deploy.yml

About "include"

- To include external YAML files:
"local" - reference a file from the same repository
"file" - reference a file from another project (same GitLab instance)
"remote" - include from a different location (full URL necessary)
"template" - include GitLab's templates

Kubernetes commands:

kubectl cluster-info
kubectl create namespace my-micro-service
kubectl get namespaces
kubectl create serviceaccount cicd-sa --namespace=my-micro-service
kubectl get serviceaccount -n my-micro-service
kubectl get serviceaccount cicd-sa -n my-micro-service -o yaml #to get yaml file output

cicd-role.yml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: my-micro-service
  name: cicd
rules:
- apiGroups: [""] #indicates the core API group
  resources: ["pods", "services", "secrets"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
- apiGroups: ["extensions", "apps"]
  resources: ["deployments"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
kubectl apply -f cicd-role.yml
kubectl create rolebinding cicd-sa-rb --role=cicd --serviceaccount=my-micro-service:cicd-sa --namespace=my-micro-service


- To copy the kubeconfig file:
kubectl get secret cicd-sa-token-4fkdd -n my-micro-service -o yaml    # this outputs the token

By default, secrets are not encrypted in Kubernetes. The token in the above command's output is base64 encoded; we need to decode it before we copy the token into our custom kubeconfig file.

echo "token_value" | base64 -D    # base64 decode (-D on macOS, -d on Linux)

deployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: $MICRO_SERVICE
  namespace: my-micro-service
spec:
  replicas: $REPLICAS
  selector:
    matchLabels:
      app: $MICRO_SERVICE
  template:
    metadata:
      labels:
        app: $MICRO_SERVICE
    spec:
      imagePullSecrets:        # used for registry authentication
      - name: my-registry-key
      containers:
      - name: $MICRO_SERVICE
        image: $IMAGE_NAME:$IMAGE_TAG
        ports:
        - containerPort: $SERVICE_PORT

service.yaml

apiVersion: v1
kind: Service
metadata:
  name: $MICRO_SERVICE
  namespace: my-micro-service
spec:
  selector:
    app: $MICRO_SERVICE
  ports:
    - protocol: TCP
      port: $SERVICE_PORT
      targetPort: $SERVICE_PORT
If many services use the same Kubernetes manifest files, it is better to place the (generic & parameterized) manifest files in a separate Git repository and reference them from the code.

- Set endpoints and other configuration from outside instead of hardcoding them in the Dockerfile
- Use a ConfigMap (for non-confidential data) or a Secret (for sensitive data) in k8s.
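A minimal ConfigMap sketch (the name and keys are illustrative); the container can consume these entries as environment variables:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: frontend-config
  namespace: my-micro-service
data:
  API_ENDPOINT: "http://backend:4000"   # endpoint configured outside the image
```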

deploy-k8s.yml

deploy:
  stage: deploy
  before_script:
    - export MICRO_SERVICE=${MICRO_SERVICE}
    - export IMAGE_NAME=$CI_REGISTRY_NAME/microservice/$MICRO_SERVICE
    - export IMAGE_TAG=$SERVICE_VERSION
    - export SERVICE_PORT=$SERVICE_PORT
    - export REPLICAS=$REPLICAS
    - export KUBECONFIG=$KUBE_CONFIG
  script:
    - kubectl create secret docker-registry my-registry-key --docker-server=$CI_REGISTRY --docker-username=$GITLAB_USER --docker-password=$GITLAB_PASSWORD -n my-micro-service --dry-run=client -o yaml|kubectl apply -f -
    #- kubectl apply -f kubernetes/deployment.yaml
    #- kubectl apply -f kubernetes/service.yaml
    - envsubst < kubernetes/deployment.yaml | kubectl apply -f -
    - envsubst < kubernetes/service.yaml | kubectl apply -f -
Note: since we are deploying images from a private Docker registry (GitLab), Kubernetes needs access to it, so we need an equivalent of "docker login". Kubernetes has a dedicated secret type for this (docker-registry): Docker config secrets, which store the credentials for accessing a container image registry.

We are not using $CI_REGISTRY_USER and $CI_REGISTRY_PASSWORD here because those credentials are no longer valid once the job completes. We need a more persistent way to store credentials, so we create custom (masked) variables for them.

If we run the above secret-creation command (kubectl create) more than once, it fails because the secret already exists. To avoid that error, we have the command render the secret as YAML (--dry-run=client -o yaml) and pipe it to kubectl apply, so that it won't fail even if run multiple times.
----------
kubectl get service -n my-micro-service
kubectl get deployment -n my-micro-service
kubectl get pod -n my-micro-service
kubectl port-forward service/frontend -n my-micro-service 3000:3000


Issues Observed:

1) Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock:
Workaround:
sudo usermod -aG docker gitlab-runner
sudo service docker restart
2) GitLab bug:
Symptoms:
chmod: unrecognized option '-----BEGIN'
ERROR: Job failed: exit status 1

Workaround:
GitLab passes the content of PRIVATE_KEY instead of a file path, so we need to create a file by echoing the content (in the current job) and then reference that file in our job. We use the 'sed' command to restore the key's line breaks, as follows (key.pem is an example file name):

- echo "${PRIVATE_KEY}" | sed -e "s/-----BEGIN RSA PRIVATE KEY-----/&\n/" -e "s/-----END RSA PRIVATE KEY-----/\n&/" -e "s/\S\{64\}/&\n/g" > key.pem

3) Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Post "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/auth": dial unix /var/run/docker.sock: connect: permission denied
Workaround:
    sudo groupadd docker
    sudo usermod -aG docker ${USER}
    sudo chmod 666 /var/run/docker.sock

──────── Credits to: TechWorld with Nana ────────

