Commit 16ad0742 authored by Sabyasachi Mondal

Update README.md

parent bfccbc15
@@ -43,18 +43,22 @@ We intend to implement the following:
*2.b Synchronized operation between the packets of each stream, which is essential for processing multiple streams together.*
We try to read each row in the image as a pack of 3 streams, process it in 2 separate blocks, and return the output as an array.
![Schematic streaming rows and output](https://mygit.th-deg.de/sm11312/fpga_final_project/-/raw/main/HLSolution.JPG "Schematic streaming rows and output")
This means we can store real-time data in frames and feed them continuously from our Python code. Each processing block consists of a 2x2 array holding the convolution weights, which are applied to our stream of data before we return the output.
![Convolution on streaming row](https://mygit.th-deg.de/sm11312/fpga_final_project/-/raw/main/RobertCross.JPG "Convolution on streaming row")
(DMA1 + DMA2) streams are processed in PU1 and (DMA2 + DMA3) streams in PU2. However, because the Roberts cross convolution needs the data to be processed as a 2x2 array, the streams must enter and be processed in a synchronized manner.
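The synchronization requirement above can be illustrated in plain Python. This is only a software sketch, not the HLS implementation: the function names `roberts_pu` and `process_three_rows` are hypothetical, and it assumes each processing unit applies both standard Roberts cross kernels and combines them as `|gx| + |gy|`, which the README does not confirm.

```python
from typing import Iterable, Iterator, List


def roberts_pu(top: Iterable[int], bottom: Iterable[int]) -> Iterator[int]:
    """Hypothetical processing unit: reads two row streams in lockstep.

    Assumes the standard Roberts cross kernels [[1, 0], [0, -1]] and
    [[0, 1], [-1, 0]], combined as |gx| + |gy|.
    """
    prev_top = prev_bottom = None
    # zip() takes one pixel from each stream per step, so the two rows
    # always advance together (the synchronized read described above).
    for t, b in zip(top, bottom):
        if prev_top is not None:
            # 2x2 window: [[prev_top, t], [prev_bottom, b]]
            gx = prev_top - b       # kernel [[1, 0], [0, -1]]
            gy = t - prev_bottom    # kernel [[0, 1], [-1, 0]]
            yield abs(gx) + abs(gy)
        prev_top, prev_bottom = t, b


def process_three_rows(row1: List[int], row2: List[int],
                       row3: List[int]) -> List[List[int]]:
    """Mimic the PU1/PU2 pairing: (row1, row2) and (row2, row3)."""
    pu1 = list(roberts_pu(row1, row2))
    pu2 = list(roberts_pu(row2, row3))
    return [pu1, pu2]
```

For three identical rows `[0, 0, 10, 10]`, both units respond only at the vertical edge between the second and third pixel. If one stream ran ahead of the other, the 2x2 window would mix pixels from different columns, which is why the streams have to stay in step.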
[CPU FPGA interconnection and data transfer]('https://mygit.th-deg.de/sm11312/fpga_final_project/-/raw/main/CPU_FPGA.JPG')
![CPU FPGA interconnection and data transfer](https://mygit.th-deg.de/sm11312/fpga_final_project/-/raw/main/CPU_FPGA.JPG "CPU FPGA interconnection and data transfer")
At a higher level, the interaction between the CPU and the FPGA looks like the schematic shown above.
We use two blocks to process the streams, but that does not mean we use a single thread: we do not wait for the Nth set of data to be processed before we start processing set N+1. Since the convolution algorithm does not wait for processing to finish, it can start to read and process set N+1 from the stream as soon as the Nth set has been read. Due to loop unrolling, this leads to parallel processing and looks something like this.
[Unravelling of streams in loop and parallel processing]('https://mygit.th-deg.de/sm11312/fpga_final_project/-/raw/main/Parallel_process.JPG')
![Unravelling of streams in loop and parallel processing](https://mygit.th-deg.de/sm11312/fpga_final_project/-/raw/main/Parallel_process.JPG "Unravelling of streams in loop and parallel processing")
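The overlap described above can be made concrete with a small cycle-count model. This is a minimal sketch under stated assumptions: the three stage names, the one-cycle-per-stage timing, and the function names are all hypothetical, and a new data set is assumed to enter every cycle. HLS tools usually call this kind of overlap loop pipelining; which directive the actual design uses is not shown here.

```python
from typing import List, Tuple


def sequential_cycles(n_sets: int, depth: int) -> int:
    """Cycles needed if set N+1 starts only after set N has fully finished."""
    return n_sets * depth


def pipelined_cycles(n_sets: int, depth: int, interval: int = 1) -> int:
    """Cycles needed if a new set enters every `interval` cycles."""
    return depth + (n_sets - 1) * interval


def schedule(n_sets: int, stages: List[str]) -> List[List[Tuple[str, int]]]:
    """For each cycle, list the (stage, set index) pairs that are active.

    Assumes one cycle per stage and a new set entering every cycle.
    """
    total = pipelined_cycles(n_sets, len(stages))
    table = []
    for cycle in range(total):
        active = []
        for position, name in enumerate(stages):
            set_index = cycle - position  # set currently occupying this stage
            if 0 <= set_index < n_sets:
                active.append((name, set_index))
        table.append(active)
    return table
```

With 4 data sets and 3 stages, the model gives 12 cycles when each set waits for the previous one and 6 cycles when they overlap. In the second cycle of `schedule(3, ["read", "compute", "write"])`, set 1 is being read while set 0 is being computed.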
## What we achieved and the caveat:
<b>*We intended to build an architecture that can take multiple streams and process them at the same level of parallelism, and we were successful.*</b>
@@ -65,6 +69,8 @@ It is not very suitable for image processing tasks as arrays stored in memory do
<b>*The CPU average for images was 10 s and the FPGA average about 6 s.*</b>
![Speed comparison in single images](https://mygit.th-deg.de/sm11312/fpga_final_project/-/raw/main/Speed_table.JPG "Speed comparison in single images")
## Future scope
*This is a new idea and has no previous references except implementation guides.*