Commit 801d7e0e authored by Sabyasachi Mondal

README.md pre-final commit

parent c9245bed
# FPGA_final_project
#### Group4: Sabyasachi Mondal, Ravi Yadav
FPGA vs CPU performance comparison and FPGA streamlining for computation-intensive tasks
FPGA for streamlining of computation-intensive tasks. In this case we take a hyperspectral image, which is generally analysed by satellites or drones and mostly consists of single-band image data. This can be used for both maritime and vehicular navigation.
# Overview
We want to use an FPGA to implement an algorithm in hardware and perform computation more efficiently. CPU hardware is not flexible, so the code always runs on the same set of registers and ALU, and we can't optimize the hardware to suit our code. Our objective here is to hardwire a processing unit (something similar to a flexible ALU built from the CLBs) in the FPGA using high-level code.
@@ -11,9 +11,7 @@ CPUs are known for their general purpose use, the same GPUs can power all kinds
In an FPGA, for example, we may implement multiple multipliers or registers to work in parallel or in a specific order at the hardware level if we want. Depending on the kind of data we expect to receive, we can implement hardware that processes exactly that type of data much faster.
For application-specific needs like signal processing, the CPU relies on the same compilation techniques and the same machine-level instructions, which can't be optimized except by designing better software at the high level / mid level.
But we can break this stereotype and, as software designers, develop our own algorithms bottom-up from the register level to high-level code (Python, for example), which may prove immensely powerful for a task-specific algorithm.
But we as software designers can develop our own algorithms bottom-up from the register level to high-level code (Python, for example), which may prove immensely powerful for a task-specific algorithm. In our case we use Python as a host to drive our FPGA.
# Objective
@@ -25,7 +23,7 @@ In this case we are going to use the FPGA to implement a processing unit in hard
*1.a Implement image inversion and build / test IP*
*1.b Implement interactive image layer extraction / exclusion using a modified watershed algorithm.*
*1.b Implement image layer extraction using modified convolution (Roberts operator).*
and
compare how the CPU performs in comparison to our FPGA hardware, which is wired up specifically for the kind of data we expect to provide as input.
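
For the CPU side of the comparison, the two operations named above can be sketched in plain Python. This is only a reference sketch under assumptions: the image is taken to be a single-band, 8-bit image stored as a list of rows, the function names `invert` and `roberts_cross` are hypothetical, and the kernel shown is the standard Roberts cross operator, not the modified convolution used in the IP (which this README does not spell out).

```python
def invert(image, max_value=255):
    """Invert every pixel of a single-band image (assumed 8-bit by default)."""
    return [[max_value - p for p in row] for row in image]


def roberts_cross(image, max_value=255):
    """Standard Roberts cross edge response, clamped to max_value.

    Uses the two 2x2 diagonal differences; the last row and column have no
    full 2x2 neighbourhood and are left at 0.
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows - 1):
        for c in range(cols - 1):
            gx = image[r][c] - image[r + 1][c + 1]      # main diagonal difference
            gy = image[r][c + 1] - image[r + 1][c]      # anti-diagonal difference
            out[r][c] = min(max_value, abs(gx) + abs(gy))
    return out


# Tiny example: a vertical edge between a dark and a bright region.
sample = [
    [0, 0, 255],
    [0, 0, 255],
    [0, 0, 255],
]
inverted = invert(sample)
edges = roberts_cross(sample)
```

Timing these functions on the host against the same operation run through the overlay gives a like-for-like baseline, provided both work on the same pixel data.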
@@ -51,28 +49,34 @@ This is how a typical openCV resizer works:
If we study the resizer code further, we notice that the 2D image is fed to our DMA and internally the whole image is read row by row, column by column. The image array size is static because we have finite space in the FPGA.
This may be made more efficient and robust (accommodating any image width) by implementing the following changes:
This may be made more efficient and robust (accommodating any image width and video) by implementing the following changes:
1. Multichannel image operation where we use parallel threads for processing. Each of these processing units, a logic entity utilizing multiple CLBs, is expected to be faster.
2. By chunking and sending data in packets from our high-level code, we can also ensure that our FPGA can process an image much larger than its own memory or DMA allocation space.
3. Creating unrolled loops for read/write operations along with function calls.
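
The chunking idea in point 2 can be sketched on the host side as follows. This is a minimal sketch under assumptions: `iter_packets`, `process_in_packets` and `send_packet` are hypothetical names, `send_packet` stands in for whatever call actually moves one packet through the DMA and returns the processed data, and the 16383 limit is counted here in elements, although the real unit depends on how the DMA is configured.

```python
def iter_packets(pixels, dma_limit=16383):
    """Yield consecutive slices of `pixels`, each strictly shorter than dma_limit."""
    if dma_limit < 2:
        raise ValueError("dma_limit must allow at least one element per packet")
    packet_len = dma_limit - 1  # stay strictly below the limit
    for start in range(0, len(pixels), packet_len):
        yield pixels[start:start + packet_len]


def process_in_packets(pixels, send_packet, dma_limit=16383):
    """Send a flattened image packet by packet and reassemble the results.

    `send_packet` is a hypothetical callable: it takes one packet and
    returns the processed packet of the same length.
    """
    result = []
    for packet in iter_packets(pixels, dma_limit):
        result.extend(send_packet(packet))
    return result


# Example with a software stand-in for the hardware (pixel inversion).
flat_image = list(range(256)) * 200          # 51200 pixels, larger than one packet
processed = process_in_packets(flat_image, lambda p: [255 - x for x in p])
```

Because each packet is handled independently, this only works as-is for per-pixel operations. A neighbourhood operation such as a convolution would need the packets to overlap by the kernel height or width.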
We use two streams of data, each processed by its own processing unit in our IP, which can be represented schematically as:
<image for our Implementation>
In the background extraction technique we use a modified form of the watershed algorithm to suit different layers of the image with a similar range and intensity of pixels, so we have a customizable layer to extract.
<Image modified watershed>
In the background extraction technique we use a modified form of convolution to extract a layer / feature from the image, for example IR bands, which can be applied as night-vision references for navigation.
<Image modified convolution>
# Tasks
The Tasks and maximum estimated time:
The Tasks and maximum actual time:
1. Problem statement and brainstorming for project selection : *24 hrs*
2. Design a basic model and build overlay : *4 hrs*
3. Python code adjustment and integration : *3 hrs*
4. Plan next stage of overlay design : *2 hrs*
2. Design a basic model and build overlay : *6 hrs*
3. Python code adjustment and integration : *4 hrs*
4. Plan next stage of overlay design : *4 hrs*
5. Develop algorithm for FPGA using C++ : *4 hrs*
6. Optimize code and add synchronization of multiple channels : *24 hrs*
7. Implement block diagram : *4 hrs*
8. Upload code and test in IPy notebook : *3 hrs*
# Resources used and Future project topics
#### Resources used
0. Images: https://serc.carleton.edu/earth_analysis/image_analysis/introduction/day_4_part_2.html
1. Image segmentation : https://theailearner.com/2020/11/29/image-segmentation-with-watershed-algorithm/
2. Operation with stream: https://www.xilinx.com/html_docs/xilinx2020_2/vitis_doc/hls_stream_library.html#ivv1539734234667__ad398476
3. Stream Interface : https://www.xilinx.com/html_docs/xilinx2020_2/vitis_doc/managing_interface_synthesis.html#ariaid-title32
@@ -108,3 +112,5 @@ DMA size must be less than 16383 so we can't feed very large datasets directly
WARNING: [HLS 200-786] Detected dataflow-on-top in function 'color_filter' (../project_3/color_filter.cpp:45) with default interface mode 'ap_ctrl_hs'. Overlapped execution of successive kernel calls will not happen unless interface mode 'ap_ctrl_chain' is used (or 'ap_ctrl_none' for a purely data-driven design).
Resolution: For help on HLS 200-786 see www.xilinx.com/cgi-bin/docs/rdoc?v=2020.2;t=hls+guidance;d=200-786.html
DMA stuck and not responding