<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Hadi]]></title><description><![CDATA[Hi! I’m Hadi, I like computer science and art. Feel free to check my videos, read my blog posts, or reach out! Thanks for stopping by!]]></description><link>https://hadicya.dev</link><generator>RSS for Node</generator><lastBuildDate>Thu, 16 Apr 2026 04:36:09 GMT</lastBuildDate><atom:link href="https://hadicya.dev/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[Portfolio]]></title><description><![CDATA[Interdimensional Gas Station

Experience a sci-fi narrative where you run a gas station on a drifting asteroid in the infinite void. Service increasingly complex alien ships, upgrade and expand your gas station's operations, and shape the fate of the...]]></description><link>https://hadicya.dev/portfolio</link><guid isPermaLink="true">https://hadicya.dev/portfolio</guid><dc:creator><![CDATA[Hadi]]></dc:creator><pubDate>Wed, 04 Sep 2024 00:54:47 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1725411192687/efd8f0e8-c72e-4630-b508-ec19d7bf55ba.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1725394251985/5b13716f-4c9a-4c75-9bc9-c957a6259a7d.png" alt class="image--center mx-auto" /></p>
<h1 id="heading-interdimensional-gas-station">Interdimensional Gas Station</h1>
<blockquote>
<p>Experience a sci-fi narrative where you run a gas station on a drifting asteroid in the infinite void. Service increasingly complex alien ships, upgrade and expand your gas station's operations, and shape the fate of the galactic stage through your choices!</p>
</blockquote>
<p>Interdimensional Gas Station is my most recent project. I serve as the technical lead, focusing primarily on game systems and production.</p>
<ul>
<li>Architected and implemented robust backend systems and models, such as random character and ship generation, global game state, dialogue, and user data reading/writing, facilitating data-driven gameplay mechanics and ensuring integration with front-end components.</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1725410514621/53856ea4-e34c-4f19-8269-60d5ac082a20.gif" alt class="image--center mx-auto" /></p>
<ul>
<li>Developed a node-based story dialogue system for immersive storytelling, alongside a GUI in Python for reading/writing stories, increasing productivity significantly for both technical and non-technical team members.</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1725396303421/a79c9863-94f1-4686-9478-aa0ce125b4a9.gif" alt class="image--center mx-auto" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1725408437866/40eeb290-f6e6-4c87-be61-bc030d09a94a.png" alt class="image--center mx-auto" /></p>
<ul>
<li>Created a 3D shader for pixel-perfect orthographic rendering, allowing 3D artists to create and integrate 3D models with a 2D aesthetic.</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1725394045683/8fdad650-a87f-4b64-8898-cc345f7ebfef.gif" alt class="image--center mx-auto" /></p>
<ul>
<li>Led the project, constantly communicating with team members of different disciplines, working with localization agencies, and coordinating with freelancers around the globe.</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1725410712137/c90c5cf5-6e7d-4027-acb1-edc3ba5d44b7.png" alt class="image--center mx-auto" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1725410718762/ced5478f-46c8-4ac5-b742-6881dac501f9.png" alt class="image--center mx-auto" /></p>
<center>
<iframe src="https://store.steampowered.com/widget/3104980/" width="646" height="190"></iframe>
</center>

<hr />
<h1 id="heading-c-game-engine">C++ Game Engine</h1>
<p><em>C++, SDL2, Lua, LuaBridge, Box2D, Vulkan</em></p>
<ul>
<li>Developed a physics-capable 2D component-architecture game engine utilizing <strong>SDL2</strong>, <strong>Box2D</strong>, and <strong>Lua</strong>.</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1725399997672/e53cba2b-693b-4b2f-98c0-610a61ae39c9.gif" alt class="image--center mx-auto" /></p>
<ul>
<li>Embedded Lua scripting capabilities with <strong>LuaBridge</strong>, allowing developers to use an exposed C++ API for rendering, playing audio, simulating physics, loading models, and more.</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1725400091540/37b6525d-dbd5-4beb-b50d-f51b3b7d5ecd.gif" alt class="image--center mx-auto" /></p>
<ul>
<li>Created an experimental 3D rendering API with <strong>Vulkan</strong>, allowing developers to work in 3D space.</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1725400452705/35287888-5542-421d-b944-dcf56c6c3b58.gif" alt class="image--center mx-auto" /></p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://github.com/HadiCya/ShadowEnginePublic">https://github.com/HadiCya/ShadowEnginePublic</a></div>
<p> </p>
<hr />
<h1 id="heading-demo-racing">Demo Racing</h1>
<p><em>JavaScript, Node.js, Express.js, socket.io, Phaser.js, Matter.js, VPS, Nginx</em></p>
<ul>
<li><p>Developed a <strong>Node.js</strong> and <strong>Express.js</strong> server for a real-time racing multiplayer game built in <strong>Phaser.js</strong> and real-time physics engine <strong>Matter.js</strong>, with client-server architecture integrated with <strong>socket.io</strong>.</p>
</li>
<li><p>Spun up a <strong>VPS</strong> on <strong>Ubuntu</strong>, hosted the app behind an <strong>Nginx</strong> reverse proxy, and created a <strong>bash</strong> redeployment script.</p>
</li>
<li><p>Established CI/CD protocols for continuous local testing and improvement.</p>
</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1725401996938/c7665760-82f9-4d0c-a19d-5c2f2f106c5a.png" alt class="image--center mx-auto" /></p>
<ul>
<li><p>Led engineering, managing over ten developers, designers, musicians, and artists with varying skill levels during weekly meetings throughout the school year, using <strong>Git</strong> and <strong>Jira</strong>.</p>
<p>  <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1725402007637/44b8e17f-09d7-470b-a1d4-6fc06b8377e3.jpeg" alt class="image--center mx-auto" /></p>
</li>
</ul>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://github.com/HadiCya/DemoRacingJS">https://github.com/HadiCya/DemoRacingJS</a></div>
<p> </p>
<hr />
<h1 id="heading-shadow-sudoku">Shadow Sudoku</h1>
<p><em>Flutter, riverpod</em></p>
<ul>
<li>Led the development of a <strong>Flutter</strong> app released on the Apple App Store, conducting user research with <strong>TestFlight</strong>.</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1725409197567/137d5836-0cec-464d-8bc6-9d2066c1b5f0.png" alt class="image--center mx-auto" /></p>
<ul>
<li>Integrated asynchronous functions to handle a sudoku <strong>API</strong> in <strong>Dart</strong>, and created game state functions for <strong>provider</strong> management, reactively caching and binding data with <strong>riverpod</strong>.</li>
</ul>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://apps.apple.com/mt/app/shadow-sudoku/id6462602241">https://apps.apple.com/mt/app/shadow-sudoku/id6462602241</a></div>
<p> </p>
<hr />
<h1 id="heading-runaway-robots">Runaway Robots</h1>
<p><em>C#, Unity, Unity Gaming Services</em></p>
<ul>
<li>Developed and released a 3D mobile game on the Apple App Store using <strong>Unity</strong>, implementing engaging gameplay features and optimizing performance for seamless user experiences on <strong>iOS</strong> devices.</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1725401593533/47ce9f83-bcc6-430a-9d09-c792e745e55a.gif" alt class="image--center mx-auto" /></p>
<ul>
<li>Integrated <strong>Unity Gaming Services</strong> to facilitate in-game advertising and leaderboards, enhancing player engagement and competitive elements within the game.</li>
</ul>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://apps.apple.com/mt/app/runaway-robots/id6478119534">https://apps.apple.com/mt/app/runaway-robots/id6478119534</a></div>
<p> </p>
<hr />
<h1 id="heading-bitmoney">BitMoney</h1>
<p><em>SwiftUI, Sketch</em></p>
<ul>
<li>Led the development and launch of an iOS financial gaming app on the App Store, and used TestFlight to conduct user testing and research, incorporating feedback into app design.</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1725409037000/d55225c2-b037-4e26-b482-6dc38644ea73.png" alt class="image--center mx-auto" /></p>
<ul>
<li>Managed project timelines and broke down important tasks for a team of five into a backlog of prioritized user stories. Delegated tasks based on members' skills, such as SwiftUI and Sketch.</li>
</ul>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://apps.apple.com/mt/app/bitmoney-financial-game/id1671570766">https://apps.apple.com/mt/app/bitmoney-financial-game/id1671570766</a></div>
]]></content:encoded></item><item><title><![CDATA[<Part 3> Make spinning 3D shapes in SDL2 and OpenGL]]></title><description><![CDATA[Up to now, we've created a window in SDL2 using OpenGL, built a Mesh, and began writing shaders in GLSL to get a stretched-out square. In this final part, we will explain rendering in 3D, making things spin, and the Linear Algebra behind it!
DISCLAIM...]]></description><link>https://hadicya.dev/part-3-make-spinning-3d-shapes-in-sdl2-and-opengl</link><guid isPermaLink="true">https://hadicya.dev/part-3-make-spinning-3d-shapes-in-sdl2-and-opengl</guid><category><![CDATA[computer graphics]]></category><category><![CDATA[linear algebra ]]></category><category><![CDATA[openGL]]></category><category><![CDATA[3d rendering]]></category><category><![CDATA[#GLM]]></category><category><![CDATA[SDL2]]></category><category><![CDATA[C++]]></category><dc:creator><![CDATA[Hadi]]></dc:creator><pubDate>Tue, 19 Dec 2023 05:54:52 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1702964702542/1fde9b34-d43b-4b03-9762-4ac53218830d.gif" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Up to now, we've created a window in SDL2 using OpenGL, built a Mesh, and began writing shaders in GLSL to get a stretched-out square. In this final part, we will explain rendering in 3D, making things spin, and the Linear Algebra behind it!</p>
<p><em>DISCLAIMER: This walkthrough assumes knowledge from Part 1 and Part 2 and a working knowledge of C++ and how to compile it.</em></p>
<p>GitHub Repo: <a target="_blank" href="https://github.com/HadiCya/spinning_shapes">https://github.com/HadiCya/spinning_shapes</a></p>
<p>YouTube Version:</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://youtu.be/ac6mf05O_qw">https://youtu.be/ac6mf05O_qw</a></div>
<p> </p>
<h1 id="heading-getting-mathematical">Getting Mathematical!</h1>
<p>We're going to be running these calculations in our <code>main.cpp</code> file, and we'll get started by updating our includes:</p>
<pre><code class="lang-cpp"><span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">&lt;iostream&gt;</span></span>
<span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">&lt;SDL.h&gt;</span></span>
<span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">"mesh.h"</span></span>
<span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">"loadShader.h"</span></span>

<span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">&lt;glm/glm.hpp&gt;</span></span>
<span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">&lt;glm/gtc/matrix_transform.hpp&gt;</span></span>
<span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">&lt;glm/gtc/type_ptr.hpp&gt;</span></span>
</code></pre>
<p>Here, we include the headers needed for linear algebra operations from the GLM library. These give the main application the vector and matrix types required for 3D transformations.</p>
<p>Next, we'll create and configure the matrices that transform the object in 3D space. Write this code after the <code>glClear()</code> call in the while loop.</p>
<h2 id="heading-model-matrix">Model Matrix</h2>
<pre><code class="lang-cpp">glm::mat4 model = glm::mat4(<span class="hljs-number">1.0f</span>);
model = glm::rotate(model, glm::radians(<span class="hljs-number">50.0f</span>), glm::vec3(<span class="hljs-number">0.5f</span>, <span class="hljs-number">1.0f</span>, <span class="hljs-number">0.0f</span>));
</code></pre>
<p>This code creates a rotation matrix. In computer graphics, rotation matrices rotate objects around a specified axis.</p>
<h3 id="heading-lets-get-into-the-math-behind-the-code">Let's get into the math behind the code:</h3>
<p>Initially, we define <code>model</code> as an identity matrix, the matrix analogue of the number 1: multiplying any matrix or vector by it leaves it unchanged.</p>
<p>$$model = \begin{bmatrix} 1 &amp; 0 &amp; 0 &amp; 0 \\ 0 &amp; 1 &amp; 0 &amp; 0 \\ 0 &amp; 0 &amp; 1 &amp; 0 \\ 0 &amp; 0 &amp; 0 &amp; 1 \end{bmatrix}$$</p><p><code>glm::vec3(0.5f, 1.0f, 0.0f)</code> defines the axis of rotation. Here, it's a vector pointing partially up and partially towards the x-axis:</p>
<p>$$\vec{u} = \left&lt;0.5, 1, 0 \right&gt;$$</p><p>We need to normalize the axis:</p>
<p>$$\vec{u'} = \left&lt; \frac{0.5}{\sqrt{0.5^2 + 1^2}}, \frac{1}{\sqrt{0.5^2 + 1^2}}, 0 \right&gt;$$</p><p><code>glm::radians(50.0f)</code> converts 50 degrees into radians, as trigonometric functions in GLM use radians:</p>
<p>$$\theta = \text{radians}(50) = \frac{50 \pi}{180}$$</p><p>The <code>glm::rotate</code> function creates a 4x4 matrix that rotates points in 3D space around the given axis by the specified angle. The rotation matrix is derived from Rodrigues' rotation formula, combining the components of the axis vector and the sine and cosine of the angle.</p>
<p>$$R = \begin{bmatrix} \cos(\theta) + u_x^2 (1 - \cos(\theta)) &amp; u_xu_y(1 - \cos(\theta)) - u_z\sin(\theta) &amp; u_xu_z(1 - \cos(\theta)) + u_y\sin(\theta) &amp; 0 \\ u_yu_x(1 - \cos(\theta)) + u_z\sin(\theta) &amp; \cos(\theta) + u_y^2(1 - \cos(\theta)) &amp; u_yu_z(1 - \cos(\theta)) - u_x\sin(\theta) &amp; 0 \\ u_zu_x(1 - \cos(\theta)) - u_y\sin(\theta) &amp; u_zu_y(1 - \cos(\theta)) + u_x\sin(\theta) &amp; \cos(\theta) + u_z^2(1 - \cos(\theta)) &amp; 0 \\ 0 &amp; 0 &amp; 0 &amp; 1 \end{bmatrix}$$</p><p>When you plug in everything we've calculated thus far into Matrix <strong>R</strong>, the resulting matrix will be stored into <code>model</code>.</p>
<h2 id="heading-view-matrix">View Matrix</h2>
<pre><code class="lang-cpp">glm::mat4 view = glm::mat4(<span class="hljs-number">1.0f</span>);
view = glm::translate(view, glm::vec3(<span class="hljs-number">0.0f</span>, <span class="hljs-number">0.0f</span>, <span class="hljs-number">-3.0f</span>));
</code></pre>
<p>The view matrix is used to transform vertices from world space to camera (view) space. It's analogous to moving and orienting a camera within the scene.</p>
<p><code>glm::vec3(0.0f, 0.0f, -3.0f)</code> indicates that the view is being translated along the z-axis.</p>
<p>$$\vec{t} = \left&lt;0, 0, -3 \right&gt;$$</p><p>The <code>glm::translate</code> function constructs a 4x4 translation matrix. This matrix is an identity matrix with the translation vector added to the fourth column. It shifts the position of all points in the scene relative to the camera.</p>
<p>For a translation by a vector <strong>t,</strong> the translation matrix <strong>T</strong> is:</p>
<p>$$T = \begin{bmatrix} 1 &amp; 0 &amp; 0 &amp; t_x \\ 0 &amp; 1 &amp; 0 &amp; t_y \\ 0 &amp; 0 &amp; 1 &amp; t_z \\ 0 &amp; 0 &amp; 0 &amp; 1 \end{bmatrix}$$</p><p>When you plug in vector <code>t</code> into Matrix <strong>T</strong>, the resulting matrix will be stored into <code>view</code>.</p>
<h2 id="heading-projection-matrix">Projection Matrix</h2>
<pre><code class="lang-cpp">glm::mat4 projection = glm::perspective(glm::radians(<span class="hljs-number">60.0f</span>), <span class="hljs-keyword">float</span>(screen_width)/(<span class="hljs-keyword">float</span>)screen_height, <span class="hljs-number">0.1f</span>, <span class="hljs-number">100.0f</span>);
</code></pre>
<p>The projection matrix maps a 3D world into a 2D view, akin to how a camera lens focuses light onto a film.</p>
<p><code>glm::perspective</code> creates a frustum that defines the visible space. It scales the x and y coordinates of vertices based on their depth (z-value), causing more distant objects to appear smaller and creating a sense of depth.</p>
<p>The frustum is a crucial concept in 3D graphics, especially for perspective projection. Let's break down what a frustum is and how it shapes the projection matrix:</p>
<h3 id="heading-what-is-a-frustum">What is a Frustum?</h3>
<ul>
<li><p>A frustum in 3D graphics is a pyramid with its top sliced off by a plane (a truncated pyramid).</p>
</li>
<li><p>The frustum defines the portion of the 3D space that is visible through the camera. Only objects within this frustum will be rendered on the screen.</p>
</li>
</ul>
<h3 id="heading-components-of-the-frustum">Components of the Frustum:</h3>
<ol>
<li><p><strong>Near Plane</strong>: This is the closest plane to the viewer (camera). In your code, it's set to 0.1f units from the viewer. Objects closer than this plane are not visible or rendered.</p>
<p> $$z_{near} = 0.1$$</p>
</li>
<li><p><strong>Far Plane</strong>: This is the farthest plane from the viewer, set to 100.0f units in your code. Objects beyond this distance are also not visible or rendered.</p>
<p> $$z_{far} = 100.0$$</p>
</li>
<li><p><strong>Sides of the Frustum</strong>: The sides of the frustum are defined by the field of view (FOV). A wider FOV creates a wider frustum, capturing a broader view of the scene but potentially introducing more distortion (similar to a wide-angle lens in photography).</p>
<p> <code>glm::radians(60.0f)</code> specifies the vertical FOV in radians. A wider FOV can capture more of the scene but can also introduce more distortion.</p>
<p> $$fov_y = \text{radians}(60) = \frac{60 \pi}{180}$$</p>
</li>
<li><p><strong>Top and Bottom</strong>: The vertical FOV fixes the top and bottom planes; the aspect ratio (width divided by height) then sets how wide the frustum is relative to its height.</p>
<p> $$aspect = \frac{\text{screen width}}{\text{screen height}}$$</p>
</li>
</ol>
<h3 id="heading-role-in-perspective-projection">Role in Perspective Projection:</h3>
<ul>
<li><p>The purpose of the frustum in perspective projection is to define how 3D coordinates are projected onto the 2D plane of the screen.</p>
</li>
<li><p>The projection matrix, <strong>P,</strong> maps coordinates from this 3D frustum to a normalized 2D coordinate system. It scales the x and y coordinates of the vertices based on their z-value (depth), which makes objects appear smaller as they get farther away, creating a realistic depth perception.</p>
</li>
</ul>
<p>$$P = \begin{bmatrix} \frac{\cot{(\frac{fov_y}{2})}}{aspect} &amp; 0 &amp; 0 &amp; 0 \\ 0 &amp; \cot{(\frac{fov_y}{2})} &amp; 0 &amp; 0 \\ 0 &amp; 0 &amp; \frac{z_{\text{far}} + z_{\text{near}}}{z_{\text{near}} - z_{\text{far}}} &amp; \frac{2 \times z_{\text{far}} \times z_{\text{near}}}{z_{\text{near}} - z_{\text{far}}} \\ 0 &amp; 0 &amp; -1 &amp; 0 \end{bmatrix}$$</p><p>When you plug in everything we've calculated thus far into Matrix <strong>P</strong>, the resulting matrix will be stored into <code>projection</code>.</p>
<h1 id="heading-connecting-it-to-the-vertex-shader">Connecting it to the vertex shader</h1>
<p>Now, we need a way to pass these matrices into the vertex shader.</p>
<pre><code class="lang-cpp">        <span class="hljs-keyword">int</span> modelLoc = glGetUniformLocation(programID, <span class="hljs-string">"model"</span>);
        glUniformMatrix4fv(modelLoc, <span class="hljs-number">1</span>, GL_FALSE, glm::value_ptr(model));

        <span class="hljs-keyword">int</span> viewLoc = glGetUniformLocation(programID, <span class="hljs-string">"view"</span>);
        glUniformMatrix4fv(viewLoc, <span class="hljs-number">1</span>, GL_FALSE, glm::value_ptr(view));

        <span class="hljs-keyword">int</span> projectionLoc = glGetUniformLocation(programID, <span class="hljs-string">"projection"</span>);
        glUniformMatrix4fv(projectionLoc, <span class="hljs-number">1</span>, GL_FALSE, glm::value_ptr(projection));
</code></pre>
<p>We are going to pass the model, view, and projection matrices to the shaders. The <code>glUniformMatrix4fv</code> function is an OpenGL function that sets the value of a uniform matrix variable in the shaders. It takes the location of the uniform variable, the number of matrices (1 in this case), a flag for matrix transpose (set to <code>GL_FALSE</code>), and a pointer to the data.</p>
<p>In <code>vertex.glsl</code>, we're going to need to grab the variables that we've just defined:</p>
<pre><code class="lang-cpp"><span class="hljs-meta">#version 330 core</span>

layout (location = <span class="hljs-number">0</span>) in vec3 aPos;
uniform mat4 model;
uniform mat4 view;
uniform mat4 projection;

<span class="hljs-function"><span class="hljs-keyword">void</span> <span class="hljs-title">main</span><span class="hljs-params">()</span> </span>{
    gl_Position = projection * view * model * vec4(aPos, <span class="hljs-number">1.0</span>);
}
</code></pre>
<p>When combining these matrices, the order of multiplication is crucial. The typical order in OpenGL for transforming a vertex <strong>v</strong> is:</p>
<ol>
<li><p>Model transformation (rotation in this case)</p>
</li>
<li><p>View transformation (translation)</p>
</li>
<li><p>Projection transformation</p>
</li>
</ol>
<p>So, the combined transformation matrix <strong>M</strong> is given by:</p>
<p>$$M = P \times T \times R$$</p><p>The final transformed vertex <strong>v'</strong> is obtained by multiplying the vertex's homogeneous coordinate vector <strong>v</strong> by <strong>M</strong>:</p>
<p>$$v' = M \times v$$</p><p>This process transforms the vertex from its local space (model transformation) to world space (view transformation), and finally to normalized device coordinates (projection transformation), ready for rasterization and rendering on the screen.</p>
<p>Before running, we need to update our <code>vertices</code> and <code>triangles</code> arrays in <code>mesh.cpp</code> to describe a 3D cube:</p>
<pre><code class="lang-cpp">    GLfloat vertices[] = {
        <span class="hljs-number">0.5f</span>, <span class="hljs-number">-0.5f</span>, <span class="hljs-number">0.5f</span>,
        <span class="hljs-number">-0.5f</span>, <span class="hljs-number">-0.5f</span>, <span class="hljs-number">0.5f</span>,
        <span class="hljs-number">0.5f</span>, <span class="hljs-number">0.5f</span>, <span class="hljs-number">0.5f</span>,
        <span class="hljs-number">-0.5f</span>, <span class="hljs-number">0.5f</span>, <span class="hljs-number">0.5f</span>,
        <span class="hljs-number">0.5f</span>, <span class="hljs-number">-0.5f</span>, <span class="hljs-number">-0.5f</span>,
        <span class="hljs-number">-0.5f</span>, <span class="hljs-number">-0.5f</span>, <span class="hljs-number">-0.5f</span>,
        <span class="hljs-number">0.5f</span>, <span class="hljs-number">0.5f</span>, <span class="hljs-number">-0.5f</span>,
        <span class="hljs-number">-0.5f</span>, <span class="hljs-number">0.5f</span>, <span class="hljs-number">-0.5f</span>
    };
    GLint triangles[] = {<span class="hljs-number">0</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">0</span>, <span class="hljs-number">3</span>, <span class="hljs-number">1</span>,
                          <span class="hljs-number">2</span>, <span class="hljs-number">6</span>, <span class="hljs-number">7</span>, <span class="hljs-number">2</span>, <span class="hljs-number">7</span>, <span class="hljs-number">3</span>,
                          <span class="hljs-number">6</span>, <span class="hljs-number">4</span>, <span class="hljs-number">5</span>, <span class="hljs-number">6</span>, <span class="hljs-number">5</span>, <span class="hljs-number">7</span>,
                          <span class="hljs-number">4</span>, <span class="hljs-number">0</span>, <span class="hljs-number">1</span>, <span class="hljs-number">4</span>, <span class="hljs-number">1</span>, <span class="hljs-number">5</span>,
                          <span class="hljs-number">1</span>, <span class="hljs-number">3</span>, <span class="hljs-number">7</span>, <span class="hljs-number">1</span>, <span class="hljs-number">7</span>, <span class="hljs-number">5</span>,
                          <span class="hljs-number">4</span>, <span class="hljs-number">6</span>, <span class="hljs-number">2</span>, <span class="hljs-number">4</span>, <span class="hljs-number">2</span>, <span class="hljs-number">0</span>};
</code></pre>
<p>If you forgot how we did this last time, look at <a target="_blank" href="https://hadicya.dev/part-2-make-spinning-3d-shapes-in-sdl2-and-opengl#heading-creating-our-mesh-class">Part 2</a> for a more detailed explanation.</p>
<p>Let's go ahead and run this code! This is what my compilation command looks like now:</p>
<p><code>clang++ -std=c++20 main.cpp mesh.cpp loadShader.cpp ./glad/src/glad.c -o spinning_shapes -I/Library/Frameworks/SDL2.framework/Headers -I./glad/include -I./glm/ -F/Library/Frameworks -framework SDL2</code></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1702963259239/d55267a8-6e80-4b4b-b199-e6f635f63965.png" alt class="image--center mx-auto" /></p>
<p>If you see a 3D cube, congrats! We are close to making it spin.</p>
<h1 id="heading-making-it-spin">Making it spin!</h1>
<p>To make it spin, all we're going to need to do is update our <code>model</code> matrix to rotate with time instead of staying at 50 degrees.</p>
<p>In <code>main.cpp</code> update the code to:</p>
<p>Add <code>Uint32 lastUpdate = SDL_GetTicks();</code> under the declaration of <code>done</code></p>
<p>Add <code>Uint32 current = SDL_GetTicks();</code> as soon as the while loop begins</p>
<p>Add <code>float dT = (current - lastUpdate) / 1000.0f;</code> underneath the <code>glClear()</code> call</p>
<p>Add <code>*dT</code> to the rotate matrix in the <code>model</code> declaration so that it looks like:</p>
<p><code>model = glm::rotate(model, glm::radians(50.0f)*dT, glm::vec3(0.5f, 1.0f, 0.0f));</code></p>
<p>In short, this code measures the time elapsed since startup in seconds (<code>lastUpdate</code> is captured once, before the loop) and scales the rotation angle by it, so the cube turns a further 50 degrees for every second that passes, independent of frame rate.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1702964136947/d48c4176-e60d-4681-9518-3c25e3e661a8.gif" alt class="image--center mx-auto" /></p>
<p>Spinning cube. Speaks for itself.</p>
<h3 id="heading-bonus-spinning-rhombus">BONUS: Spinning Rhombus</h3>
<p>In <code>mesh.cpp</code> update the vertices and triangles to make a rhombus:</p>
<pre><code class="lang-cpp">    GLfloat vertices[] = {
        <span class="hljs-number">0.5f</span>, <span class="hljs-number">0.0f</span>, <span class="hljs-number">0.5f</span>,
        <span class="hljs-number">-0.5f</span>, <span class="hljs-number">0.0f</span>, <span class="hljs-number">0.5f</span>,
        <span class="hljs-number">-0.5f</span>, <span class="hljs-number">0.0f</span>, <span class="hljs-number">-0.5f</span>,
        <span class="hljs-number">0.5f</span>, <span class="hljs-number">0.0f</span>, <span class="hljs-number">-0.5f</span>,
        <span class="hljs-number">0.0f</span>, <span class="hljs-number">1.0f</span>, <span class="hljs-number">0.0f</span>,
        <span class="hljs-number">0.0f</span>, <span class="hljs-number">-1.0f</span>, <span class="hljs-number">0.0f</span>
    };

    GLint triangles[] = {
        <span class="hljs-number">0</span>, <span class="hljs-number">4</span>, <span class="hljs-number">1</span>, 
        <span class="hljs-number">1</span>, <span class="hljs-number">4</span>, <span class="hljs-number">2</span>, 
        <span class="hljs-number">2</span>, <span class="hljs-number">4</span>, <span class="hljs-number">3</span>, 
        <span class="hljs-number">3</span>, <span class="hljs-number">4</span>, <span class="hljs-number">0</span>, 
        <span class="hljs-number">0</span>, <span class="hljs-number">5</span>, <span class="hljs-number">1</span>, 
        <span class="hljs-number">1</span>, <span class="hljs-number">5</span>, <span class="hljs-number">2</span>, 
        <span class="hljs-number">2</span>, <span class="hljs-number">5</span>, <span class="hljs-number">3</span>, 
        <span class="hljs-number">3</span>, <span class="hljs-number">5</span>, <span class="hljs-number">0</span>};
</code></pre>
<p>Now when we run our code we get a Rhombus:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1702964401215/aa5d6968-656a-463a-8bac-bce97b17deea.gif" alt class="image--center mx-auto" /></p>
<p>I suggest building your own mesh to get a better feel for 3D space and how the vertex and index data are laid out in memory.</p>
<h2 id="heading-wrapping-up"><strong>Wrapping Up</strong></h2>
<p>Throughout this series, we have embarked on an exciting journey into the world of 3D graphics programming. Starting from the basics of OpenGL and SDL initialization, we have explored fundamental concepts such as vertices, triangle arrays, and buffer management. Along the way, we’ve delved into the mathematical principles that make 3D rendering possible.</p>
<h2 id="heading-key-takeaways"><strong>Key Takeaways</strong></h2>
<ol>
<li><p>Understanding Vertices and Buffers: We learned the role of defining vertices and the order in which to draw them, and how buffers enable efficient storage and manipulation of vertex data on the GPU.</p>
</li>
<li><p>3D Transformations and Projection: Through matrix mathematics, we've seen how to translate, rotate, and scale objects in 3D space. While GLM handles this math for us, we took the opportunity to see what happens behind the scenes. We also learned how perspective projection brings depth to our rendering.</p>
</li>
<li><p>Creating our Spinning Cube! By combining all of the elements we learned, we were able to render a spinning 3D cube, which is the first building block for creating far more complex 3D scenes.</p>
</li>
</ol>
<h2 id="heading-whats-next"><strong>What’s Next?</strong></h2>
<p>The door to 3D graphics programming is now wide open. With the foundation laid in this series, you can explore further topics such as:</p>
<ul>
<li><p>Texture Mapping: Adding textures to give surfaces realistic appearances.</p>
</li>
<li><p>Lighting and Shading: Implementing various lighting models to create a more lifelike scene.</p>
</li>
<li><p>3D Modeling and Animation: Importing and manipulating complex 3D models, and creating intricate animations.</p>
</li>
</ul>
<p>Whether you’re looking to build a game, create a simulation, make a movie, or simply experiment with 3D graphics, the skills you’ve acquired here will be a valuable asset.</p>
<p>Thank you for joining me on this adventure into 3D graphics programming. Keep experimenting, learning, and most importantly, having fun! Build your castle in the sky; your creativity and intellect will be enough :)</p>
<p>Happy coding!</p>
]]></content:encoded></item><item><title><![CDATA[Building Pong with MonoGame: A Step-by-Step Guide]]></title><description><![CDATA[In this tutorial, you will create the game Pong in MonoGame using C#. We will be able to play pong with a friend, and more importantly, learn how to build games in MonoGame.
MonoGame is an open-source, cross-platform game development framework capabl...]]></description><link>https://hadicya.dev/building-pong-with-monogame-a-step-by-step-guide</link><guid isPermaLink="true">https://hadicya.dev/building-pong-with-monogame-a-step-by-step-guide</guid><category><![CDATA[coding]]></category><category><![CDATA[monogame]]></category><category><![CDATA[pong]]></category><category><![CDATA[Game Development]]></category><dc:creator><![CDATA[Hadi]]></dc:creator><pubDate>Sun, 08 Oct 2023 19:58:05 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1696749464952/51e93f2c-d4c2-4ccc-932a-4f9c3354b252.gif" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In this tutorial, you will create the game Pong in MonoGame using C#. We will be able to play pong with a friend, and more importantly, learn how to build games in MonoGame.</p>
<p>MonoGame is an open-source, cross-platform game development framework capable of building 2D and 3D games. It is built from the now-discontinued Microsoft XNA framework, which was popular back in the Xbox 360 days for creating games on the Xbox and PC.</p>
<p>As a game developer, using MonoGame might seem backward in a world of advanced game engines like Unity and Unreal Engine, which provide many tools and components that we would otherwise have to program ourselves. However, building the engine components that are specific to your game requires you to understand the code at a far deeper level, which will make you better at identifying programming patterns.</p>
<blockquote>
<p>In this tutorial, you will learn how to create a Pong game using MonoGame and C#. This tutorial covers setting up the environment, creating the Paddle and Ball classes, handling collisions, and displaying the score. A basic knowledge of object-oriented programming languages such as C# or C++ is assumed. By the end of this tutorial, you will have a fully functional Pong game to play with a friend and a deeper understanding of MonoGame's framework.</p>
</blockquote>
<p>Let's get started!</p>
<p><em>DISCLAIMER: This walkthrough assumes a basic knowledge of any OOP (Object Oriented Programming) Language, such as C# or C++.</em></p>
<p>GitHub Repo: <a target="_blank" href="https://github.com/HadiCya/Pong-MonoGame">https://github.com/HadiCya/Pong-MonoGame</a></p>
<p>YouTube Version:</p>
<iframe width="560" height="315" src="https://www.youtube.com/embed/Dq2lpY6vJoI?si=gLX1lBYcAnnA7Iyh"></iframe>

<h1 id="heading-what-youll-need">What you'll need</h1>
<p>Let's get set up. We'll need a couple of things to get started:</p>
<ul>
<li><p><a target="_blank" href="https://dotnet.microsoft.com/en-us/download/dotnet/6.0">.NET 6.0 SDK</a></p>
</li>
<li><p><a target="_blank" href="https://docs.monogame.net/articles/getting_started/0_getting_started.html">MonoGame</a></p>
</li>
</ul>
<p>First things first, make sure <a target="_blank" href="https://dotnet.microsoft.com/en-us/download/dotnet/6.0">.NET 6.0 SDK</a> is installed on your computer matching your operating system (Windows, MacOS, or Linux).</p>
<p>The recommended way to install MonoGame is through Visual Studio 2022, because there is an extension you can use directly in the IDE, which makes creating a project much simpler. Follow this guide by <a target="_blank" href="https://docs.monogame.net/articles/getting_started/0_getting_started.html">MonoGame</a> to install it for your specific operating system. You can also run everything from the command line without an IDE if you prefer Visual Studio Code, vim, or some other code editor.</p>
<p>Now that you have it installed, let's create a project.</p>
<p>If you're using <strong>Visual Studio 2022</strong>, create a new project, select the <code>MonoGame Cross Platform Desktop Application</code> option, and name the project Pong.</p>
<p><img src="https://docs.monogame.net/images/getting_started/vsmac-mg-new-2.png" alt="New Template" /></p>
<p>If you're using the <strong>command line</strong>, navigate to where you'd like to create a project and type:</p>
<p><code>dotnet new mgdesktopgl -o Pong</code></p>
<p>This will create a new dotnet project called Pong.</p>
<p>Now that you've created a MonoGame project, you'll see that a lot of files have been generated where you created the project. If you're using Visual Studio 2022, go ahead and run the program with the debug button or by pressing <code>F5</code>. If you're using the command line, type <code>dotnet run</code>. When you run the project, a blue window should pop up.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1696712387568/4cd1c7f9-18af-40d5-8388-9ce55ba1d763.png" alt class="image--center mx-auto" /></p>
<p>If you see this, you're ready to get started with MonoGame!</p>
<h1 id="heading-preparing-the-code">Preparing the code</h1>
<p>The project that is automatically generated has a lot of files in it, so it may be overwhelming to look at first glance. But, we'll go through the important ones right now and try to understand how the code is structured. This is the general file structure:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1696713885315/801e1c79-cf5d-4a78-97aa-fb2469a0cf75.png" alt class="image--center mx-auto" /></p>
<p>While there are ways we could reorganize this for a project we'd be working on long term, the default structure is fine since we're just making Pong. That being said, the only file we will concern ourselves with (right now) is <code>Game1.cs</code>, which is where our game runs.</p>
<p>In <code>Game1.cs</code> we are defining the class <code>Game1</code>. Let's break it down:</p>
<ul>
<li><p><code>Game1()</code> is the constructor for our class. It runs when we create the instance of our game; the default code just defines some basic settings the game needs to run.</p>
</li>
<li><p><code>Initialize()</code> is where we will initialize objects inside our game. When we create our paddle class, we will initialize our players here.</p>
</li>
<li><p><code>LoadContent()</code> is where we can load content, such as sprites, fonts, audio, and other content.</p>
</li>
<li><p><code>Update(GameTime gameTime)</code> is called every frame, and it is where we will update our game logic.</p>
</li>
<li><p><code>Draw(GameTime gameTime)</code> is where we will draw our objects to the screen.</p>
</li>
</ul>
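<p>Put together, the default <code>Game1.cs</code> is structured roughly like this (a trimmed sketch of the generated template; your generated file will have a few extra comments and a gamepad exit check):</p>
<pre><code class="lang-csharp">using Microsoft.Xna.Framework;
using Microsoft.Xna.Framework.Graphics;

namespace Pong
{
    public class Game1 : Game
    {
        private GraphicsDeviceManager _graphics;
        private SpriteBatch _spriteBatch;

        public Game1() {
            _graphics = new GraphicsDeviceManager(this);
            Content.RootDirectory = "Content";
            IsMouseVisible = true;
        }

        protected override void Initialize() {
            // Create game objects here
            base.Initialize();
        }

        protected override void LoadContent() {
            _spriteBatch = new SpriteBatch(GraphicsDevice);
            // Load sprites, fonts, and audio here
        }

        protected override void Update(GameTime gameTime) {
            // Game logic runs here every frame
            base.Update(gameTime);
        }

        protected override void Draw(GameTime gameTime) {
            GraphicsDevice.Clear(Color.CornflowerBlue);
            // Drawing code goes here
            base.Draw(gameTime);
        }
    }
}
</code></pre>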
<p>As you can see, everything is broken up into clear functions, and we're going to try and keep things clean in this manner while we create our game.</p>
<p>We're going to avoid running too much game logic in the <code>Game1</code> class, as we want every object in the game to have a clear role. For this project, the only classes we need are <code>Paddle</code> and <code>Ball</code>. A more elaborate structure would give us more separation and more flexibility to change the game later, but for this tutorial, we're just going to worry about getting a working Pong.</p>
<p>Before we start coding, let's prepare the existing code for our game.</p>
<p>Create a file called <code>Globals.cs</code> so that we can store global variables for our game.</p>
<pre><code class="lang-csharp"><span class="hljs-keyword">using</span> Microsoft.Xna.Framework.Content;
<span class="hljs-keyword">using</span> Microsoft.Xna.Framework.Graphics;

<span class="hljs-keyword">namespace</span> <span class="hljs-title">Pong</span>
{
    <span class="hljs-keyword">class</span> <span class="hljs-title">Globals</span>
    {
        <span class="hljs-keyword">public</span> <span class="hljs-keyword">static</span> SpriteBatch spriteBatch;
        <span class="hljs-keyword">public</span> <span class="hljs-keyword">static</span> <span class="hljs-keyword">int</span> WIDTH = <span class="hljs-number">640</span>, HEIGHT = <span class="hljs-number">480</span>;
    }
}
</code></pre>
<p>In our Globals class, we want to store some public global variables, so that we can access them from anywhere in our game. For now, we're going to store the width and height of our screen, as well as <code>SpriteBatch</code>, which is going to make drawing things to the screen possible.</p>
<p>Because we've created a global variable <code>spriteBatch</code>, we can replace all instances of the local private one with the global version. Go into your <code>Game1</code> class, and delete the <code>_spriteBatch</code> class variable that was automatically created.</p>
<p>Replace any mention of <code>_spriteBatch</code> in <code>Game1</code> with our newly created <code>Globals.spriteBatch</code></p>
<p>Let's also update our constructor to use the <code>WIDTH</code> and <code>HEIGHT</code> variables we defined earlier.</p>
<pre><code class="lang-csharp"><span class="hljs-function"><span class="hljs-keyword">public</span> <span class="hljs-title">Game1</span>(<span class="hljs-params"></span>)</span> {
    _graphics = <span class="hljs-keyword">new</span> GraphicsDeviceManager(<span class="hljs-keyword">this</span>);
    _graphics.PreferredBackBufferWidth = Globals.WIDTH;
    _graphics.PreferredBackBufferHeight = Globals.HEIGHT;
    Content.RootDirectory = <span class="hljs-string">"Content"</span>;
    IsMouseVisible = <span class="hljs-literal">true</span>;
}
</code></pre>
<p>The last thing we'll need to do is update our <code>Draw()</code> function:</p>
<pre><code class="lang-csharp"><span class="hljs-function"><span class="hljs-keyword">protected</span> <span class="hljs-keyword">override</span> <span class="hljs-keyword">void</span> <span class="hljs-title">Draw</span>(<span class="hljs-params">GameTime gameTime</span>)</span> {
    GraphicsDevice.Clear(Color.Black);

    Globals.spriteBatch.Begin();
    <span class="hljs-comment">// <span class="hljs-doctag">TODO:</span> Add your drawing code here            </span>
    Globals.spriteBatch.End();

    <span class="hljs-keyword">base</span>.Draw(gameTime);
}
</code></pre>
<p>We need to change the Color from <code>Color.CornflowerBlue</code> to <code>Color.Black</code> (unless you want your background to be some other color).</p>
<p>To draw game objects on the screen, start the drawing process by using the <code>Begin()</code> method on the <code>Globals.spriteBatch</code> object. This sets up everything needed to show our game elements. After finishing all drawing tasks, end the process by using the <code>End()</code> method on the <code>Globals.spriteBatch</code> object. This makes sure all drawing operations are done and the elements show up on the screen.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1696723603318/cc71d14e-32c6-4865-b0f3-57f42f5668aa.png" alt class="image--center mx-auto" /></p>
<p>Not much has changed, but now it will be easier to create objects and draw to the screen. Next, we're going to create the <code>Paddle</code>.</p>
<h1 id="heading-creating-the-paddle">Creating the Paddle</h1>
<p>Let's go ahead and create a file called <code>Paddle.cs</code> and define our <code>Paddle</code> class. In this class, create a constructor, a <code>Update()</code> and a <code>Draw()</code> function.</p>
<p>We're also going to want to define a <code>Rectangle</code> object, which will represent the paddle.</p>
<p>This is how our class should look:</p>
<pre><code class="lang-csharp"><span class="hljs-keyword">using</span> Microsoft.Xna.Framework;
<span class="hljs-keyword">using</span> Microsoft.Xna.Framework.Input;

<span class="hljs-keyword">namespace</span> <span class="hljs-title">Pong</span> {
    <span class="hljs-keyword">public</span> <span class="hljs-keyword">class</span> <span class="hljs-title">Paddle</span> {
        <span class="hljs-keyword">public</span> Rectangle rect;

        <span class="hljs-function"><span class="hljs-keyword">public</span> <span class="hljs-title">Paddle</span>(<span class="hljs-params"></span>)</span> {}
        <span class="hljs-function"><span class="hljs-keyword">public</span> <span class="hljs-keyword">void</span> <span class="hljs-title">Update</span>(<span class="hljs-params">GameTime gameTime</span>)</span> {}
        <span class="hljs-function"><span class="hljs-keyword">public</span> <span class="hljs-keyword">void</span> <span class="hljs-title">Draw</span>(<span class="hljs-params"></span>)</span> {}
    }
}
</code></pre>
<p>Now that we've set up the <code>Paddle</code> class, let's try just getting a rectangle on the screen.</p>
<p>In the constructor, let's define what <code>rect</code> is. The parameters for the MonoGame <code>Rectangle</code> class is: <code>Rectangle(int x, int y, int width, int height)</code></p>
<p>As you can see, we need to fill in the position of the rectangle, x and y, as well as the width and height.</p>
<p>In computer graphics, we use a different coordinate system than the usual one from math class. The top-left corner of the screen is the starting point <code>(0,0)</code>. The x-axis shows the horizontal position, and the y-axis shows the vertical position. This system makes it easier to work with graphics on screens and in programming.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1696725164891/06d7e30e-c457-4a47-89ff-f366214ae89b.png" alt="Courtesy of Processing.org" class="image--center mx-auto" /></p>
<p>So, let's go to the constructor, and define our <code>rect</code> variable:</p>
<p><code>rect = new Rectangle(0, 140, 40, 200);</code></p>
<p>Now we're going to want to draw the paddle onto the screen, but before we write anything in the <code>Draw()</code> function, we need to define a <code>Texture2D</code> for our rectangle.</p>
<p>In <code>Globals.cs</code>, add <code>public static Texture2D pixel;</code> to the instance variables.</p>
<p>In <code>Game1.cs</code>, update the <code>LoadContent()</code> function to this:</p>
<pre><code class="lang-csharp"><span class="hljs-function"><span class="hljs-keyword">protected</span> <span class="hljs-keyword">override</span> <span class="hljs-keyword">void</span> <span class="hljs-title">LoadContent</span>(<span class="hljs-params"></span>)</span> {
    Globals.spriteBatch = <span class="hljs-keyword">new</span> SpriteBatch(GraphicsDevice);

    Globals.pixel = <span class="hljs-keyword">new</span> Texture2D(GraphicsDevice, <span class="hljs-number">1</span>, <span class="hljs-number">1</span>);
    Globals.pixel.SetData&lt;Color&gt;(<span class="hljs-keyword">new</span> Color[] { Color.White });
}
</code></pre>
<p>What we're doing here is defining the <code>pixel</code> variable we just created in our <code>Globals</code> class. In effect, we're creating a 1x1 white texture that can be stretched to draw any solid rectangle.</p>
<p>Now, we can go back to our <code>Paddle.cs</code> class, and update our <code>Draw()</code> function to this:</p>
<pre><code class="lang-csharp"><span class="hljs-function"><span class="hljs-keyword">public</span> <span class="hljs-keyword">void</span> <span class="hljs-title">Draw</span>(<span class="hljs-params"></span>)</span> {
    Globals.spriteBatch.Draw(Globals.pixel, rect, Color.White);
}
</code></pre>
<p>Here, we access the global <code>spriteBatch</code> variable and draw our rectangle to it using our plain white texture. Because <code>pixel</code> is white, we could tint it any <code>Color</code> we want, but in this case, we'll keep it white to stay true to the original Pong.</p>
<p>To get things onto the screen, we need to initialize, update, and draw our <code>Paddle</code> in the <code>Game1</code> class. For every matching function, we need to call it in <code>Game1</code>.</p>
<p>In <code>Game1</code>:</p>
<ul>
<li><p>In the class variables, add <code>Paddle paddle;</code></p>
</li>
<li><p>In <code>Initialize()</code>, add <code>paddle = new Paddle();</code> right before <code>base.Initialize();</code></p>
</li>
<li><p>In <code>Update()</code>, add <code>paddle.Update(gameTime);</code> right before <code>base.Update(gameTime);</code></p>
</li>
<li><p>In <code>Draw()</code>, add <code>paddle.Draw();</code> right before <code>Globals.spriteBatch.End();</code></p>
</li>
</ul>
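<p>After those edits, the relevant parts of <code>Game1</code> should look roughly like this (only the changed members are shown):</p>
<pre><code class="lang-csharp">Paddle paddle;

protected override void Initialize() {
    paddle = new Paddle();
    base.Initialize();
}

protected override void Update(GameTime gameTime) {
    paddle.Update(gameTime);
    base.Update(gameTime);
}

protected override void Draw(GameTime gameTime) {
    GraphicsDevice.Clear(Color.Black);

    Globals.spriteBatch.Begin();
    paddle.Draw();
    Globals.spriteBatch.End();

    base.Draw(gameTime);
}
</code></pre>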
<p>Go ahead and run the project now.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1696794050335/dc84ae68-8330-493f-9d54-0df5c143383b.png" alt class="image--center mx-auto" /></p>
<p>If you did everything correctly, we should be seeing our paddle on the screen! The code is now ready to add movement.</p>
<p>Back in the <code>Paddle.cs</code> file, add a new class variable <code>float moveSpeed = 500f;</code></p>
<p>Now, let's fill in the <code>Update()</code> function so we can move the Paddle:</p>
<pre><code class="lang-csharp"><span class="hljs-function"><span class="hljs-keyword">public</span> <span class="hljs-keyword">void</span> <span class="hljs-title">Update</span>(<span class="hljs-params">GameTime gameTime</span>)</span> {
    KeyboardState kstate = Keyboard.GetState();
    <span class="hljs-keyword">if</span> (kstate.IsKeyDown(Keys.W) &amp;&amp; rect.Y &gt; <span class="hljs-number">0</span>) {
        rect.Y -= (<span class="hljs-keyword">int</span>)(moveSpeed * (<span class="hljs-keyword">float</span>)gameTime.ElapsedGameTime.TotalSeconds);
    }
    <span class="hljs-keyword">if</span> (kstate.IsKeyDown(Keys.S) &amp;&amp; rect.Y &lt; Globals.HEIGHT - rect.Height) {
        rect.Y += (<span class="hljs-keyword">int</span>)(moveSpeed * (<span class="hljs-keyword">float</span>)gameTime.ElapsedGameTime.TotalSeconds);
    }
}
</code></pre>
<p>Let's go through this line by line:</p>
<p><code>KeyboardState kstate = Keyboard.GetState();</code> is getting the current state of the Keyboard, which will give us information on what keys are being pressed.</p>
<p><code>if (kstate.IsKeyDown(Keys.W) &amp;&amp; rect.Y &gt; 0) {</code> is checking to see if the player is pressing <code>W</code> to go up and that the paddle is not being clipped off the top of the screen.</p>
<p><code>rect.Y -= (int)(moveSpeed * (float)gameTime.ElapsedGameTime.TotalSeconds);</code> is changing the Y position of the paddle, so that it can go up based on the <code>moveSpeed</code> multiplied by the time between the last frame and the current frame. We call this delta time (the difference between the previous and current frame).</p>
<p>Delta time helps make smooth movement and animations in games, no matter the frame rate or device performance.</p>
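<p>To see why this works, here is a quick sanity check (assuming steady frame rates of 60 and 30 FPS):</p>
<pre><code class="lang-csharp">float moveSpeed = 500f;

// At 60 FPS, a frame lasts about 1/60 of a second, so the paddle
// moves 500 * (1/60), roughly 8.3 pixels, in that frame.
float at60 = moveSpeed * (1f / 60f);

// At 30 FPS, a frame lasts twice as long, so it moves roughly 16.7
// pixels per frame; either way, it covers ~500 pixels per second.
float at30 = moveSpeed * (1f / 30f);
</code></pre>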
<p><code>if (kstate.IsKeyDown(Keys.S) &amp;&amp; rect.Y &lt; Globals.HEIGHT - rect.Height) {</code> is checking to see if the player is pressing <code>S</code> to go down and that the paddle is not being clipped off the bottom of the screen.</p>
<p><code>rect.Y += (int)(moveSpeed * (float)gameTime.ElapsedGameTime.TotalSeconds);</code> is changing the Y position of the paddle, so that it can go down, just as we did earlier with up.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1696794307951/6640b6d4-91b9-48e4-883c-d2f2959c6021.gif" alt class="image--center mx-auto" /></p>
<p>If you've done everything correctly, we should have movement! All we have to do now is add a second player, which should be super easy now that we've made our <code>Paddle</code> class! We just need to let the constructor take a parameter indicating whether it's the second player, and then make modifications accordingly.</p>
<p>In the <code>Paddle</code> class, let's add a class variable <code>bool isSecondPlayer;</code></p>
<p>In the constructor, we're going to need to take a boolean parameter so we can make changes based on if the instance of the <code>Paddle</code> class is the second player or not.</p>
<pre><code class="lang-csharp"><span class="hljs-function"><span class="hljs-keyword">public</span> <span class="hljs-title">Paddle</span>(<span class="hljs-params"><span class="hljs-keyword">bool</span> isSecondPlayer</span>)</span> {
    <span class="hljs-keyword">this</span>.isSecondPlayer = isSecondPlayer;
    rect = <span class="hljs-keyword">new</span> Rectangle((<span class="hljs-keyword">this</span>.isSecondPlayer ? Globals.WIDTH - <span class="hljs-number">40</span> : <span class="hljs-number">0</span>), <span class="hljs-number">140</span>, <span class="hljs-number">40</span>, <span class="hljs-number">200</span>);
}
</code></pre>
<p>Here, the ternary expression sets the X coordinate so that the paddle appears on the right side of the screen when it's the second player.</p>
<p>Now, we need to change the <code>Update()</code> function to support a second player.</p>
<pre><code class="lang-csharp"><span class="hljs-function"><span class="hljs-keyword">public</span> <span class="hljs-keyword">void</span> <span class="hljs-title">Update</span>(<span class="hljs-params">GameTime gameTime</span>)</span> {
    KeyboardState kstate = Keyboard.GetState();
    <span class="hljs-keyword">if</span> ((<span class="hljs-keyword">this</span>.isSecondPlayer ? kstate.IsKeyDown(Keys.Up) : kstate.IsKeyDown(Keys.W)) &amp;&amp; rect.Y &gt; <span class="hljs-number">0</span>) {
        rect.Y -= (<span class="hljs-keyword">int</span>)(moveSpeed * (<span class="hljs-keyword">float</span>)gameTime.ElapsedGameTime.TotalSeconds);
    }
    <span class="hljs-keyword">if</span> ((<span class="hljs-keyword">this</span>.isSecondPlayer ? kstate.IsKeyDown(Keys.Down) : kstate.IsKeyDown(Keys.S)) &amp;&amp; rect.Y &lt; Globals.HEIGHT - rect.Height) {
        rect.Y += (<span class="hljs-keyword">int</span>)(moveSpeed * (<span class="hljs-keyword">float</span>)gameTime.ElapsedGameTime.TotalSeconds);
    }
}
</code></pre>
<p>We're using a ternary expression again to choose the movement keys based on whether the paddle is the second player.</p>
<p>Finally, we need to go into <code>Game1</code> and add a second instance of <code>Paddle</code>, as well as updating the first <code>Paddle</code> instance to work with our new constructor.</p>
<p>In <code>Game1</code>:</p>
<ul>
<li><p>In the class variables, add <code>Paddle paddle2;</code></p>
</li>
<li><p>In <code>Initialize()</code>, update <code>paddle = new Paddle();</code> to <code>paddle = new Paddle(false);</code></p>
</li>
<li><p>In <code>Initialize()</code>, add <code>paddle2 = new Paddle(true);</code> right before <code>base.Initialize();</code></p>
</li>
<li><p>In <code>Update()</code>, add <code>paddle2.Update(gameTime);</code> right before <code>base.Update(gameTime);</code></p>
</li>
<li><p>In <code>Draw()</code>, add <code>paddle2.Draw();</code> right before <code>Globals.spriteBatch.End();</code></p>
</li>
</ul>
<p>Now, let's run the code!</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1696794321901/ccfd3459-a29f-4e92-9e85-cb04f5c081c7.gif" alt class="image--center mx-auto" /></p>
<p>Wonderful! We now have 2 players on the screen. Now, we need to create the <code>Ball</code> class and then make the final touches!</p>
<h1 id="heading-creating-the-ball">Creating the Ball</h1>
<p>Before we create the <code>Ball</code> class, we need to create a way to store the score. In the <code>Globals</code> class, create two new integer variables:</p>
<p><code>public static int player1_score, player2_score;</code></p>
<p>Create a file <code>Ball.cs</code> for the <code>Ball</code> class.</p>
<p>A lot of what we'll do to draw the rectangle on the screen is extremely similar to what we did for the Paddle, so we can use this as our starting point:</p>
<pre><code class="lang-csharp"><span class="hljs-keyword">using</span> Microsoft.Xna.Framework;

<span class="hljs-keyword">namespace</span> <span class="hljs-title">Pong</span> {
    <span class="hljs-keyword">public</span> <span class="hljs-keyword">class</span> <span class="hljs-title">Ball</span> {
        Rectangle rect;
        <span class="hljs-keyword">int</span> right = <span class="hljs-number">1</span>, top = <span class="hljs-number">1</span>, moveSpeed = <span class="hljs-number">200</span>;

        <span class="hljs-function"><span class="hljs-keyword">public</span> <span class="hljs-title">Ball</span>(<span class="hljs-params"></span>)</span> {
            rect = <span class="hljs-keyword">new</span> Rectangle(Globals.WIDTH / <span class="hljs-number">2</span> - <span class="hljs-number">20</span>, Globals.HEIGHT / <span class="hljs-number">2</span> - <span class="hljs-number">20</span>, <span class="hljs-number">40</span>, <span class="hljs-number">40</span>);
        }
        <span class="hljs-function"><span class="hljs-keyword">public</span> <span class="hljs-keyword">void</span> <span class="hljs-title">Update</span>(<span class="hljs-params">GameTime gameTime, Paddle player1, Paddle player2</span>)</span> {}
        <span class="hljs-function"><span class="hljs-keyword">public</span> <span class="hljs-keyword">void</span> <span class="hljs-title">Draw</span>(<span class="hljs-params"></span>)</span> {
            Globals.spriteBatch.Draw(Globals.pixel, rect, Color.White);
        }
    }
}
</code></pre>
<p>We're using the Global <code>WIDTH</code> and <code>HEIGHT</code> variables to put the ball at the center of the screen.</p>
<p>Also, we are going to take in both <code>Paddle</code> instances, since we want to use the <code>Ball</code> for all game calculations. This isn't necessarily the best practice, but since it's a simple game like <code>Pong</code>, it's alright.</p>
<p>We also created three new integer class variables <code>right = 1</code>, <code>top = 1</code>, and <code>moveSpeed = 200</code>.</p>
<p>We will use <code>right</code> and <code>top</code> to store either <code>1</code> or <code>-1</code> so that we can create conditionals for when the ball collides with something. We're going to want to keep it simple, so the ball will always move at a 45° angle. By adding the same values to the X and the Y of the ball, it will always bounce at the same 45° angle.</p>
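<p>A quick way to convince yourself of the 45° claim: a velocity with equal X and Y components always points 45° from the horizontal, regardless of how large the components are. As a small standalone check (not part of the game code):</p>
<pre><code class="lang-csharp">// Moving the ball by (d, d) each frame means the direction angle is
// atan2(d, d) = 45 degrees, whatever d happens to be.
double angle = Math.Atan2(1, 1) * 180.0 / Math.PI; // 45
</code></pre>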
<p>In the <code>Update()</code> function, we're going to add a bunch of if statements back to back to control the ball:</p>
<ul>
<li>Before any if statements, we're just going to move the ball based on its current trajectory. For every frame, we're multiplying the <code>moveSpeed</code> with the delta time, and then making it negative or positive based on the modifiers <code>right</code> and <code>top</code>.</li>
</ul>
<pre><code class="lang-csharp"><span class="hljs-keyword">int</span> deltaSpeed = (<span class="hljs-keyword">int</span>)(moveSpeed * (<span class="hljs-keyword">float</span>)gameTime.ElapsedGameTime.TotalSeconds);
rect.X += right * deltaSpeed;
rect.Y += top * deltaSpeed;
</code></pre>
<ul>
<li>Our first conditional handles the left paddle, the first player. Each comparison in this if statement checks a side of the ball against a side of the paddle, so that when the ball hits the paddle at a valid position, its horizontal direction is reversed.</li>
</ul>
<pre><code class="lang-csharp"><span class="hljs-keyword">if</span> (player1.rect.Right &gt; rect.Left &amp;&amp; rect.Top &gt; player1.rect.Top &amp;&amp; rect.Bottom &lt; player1.rect.Bottom) {
    right = <span class="hljs-number">1</span>;
}
</code></pre>
<ul>
<li>The same thing happens for the right paddle, except the left and right sides are swapped so the collision makes sense from its side of the screen.</li>
</ul>
<pre><code class="lang-csharp"><span class="hljs-keyword">if</span> (player2.rect.Left &lt; rect.Right &amp;&amp; rect.Top &gt; player2.rect.Top &amp;&amp; rect.Bottom &lt; player2.rect.Bottom) {
    right = <span class="hljs-number">-1</span>;
}
</code></pre>
<ul>
<li>We check to see if the ball hits the top of the window, and if it does we reverse the top direction so that it can bounce down.</li>
</ul>
<pre><code class="lang-csharp"><span class="hljs-keyword">if</span> (rect.Y &lt; <span class="hljs-number">0</span>) {
    top *= <span class="hljs-number">-1</span>;
}
</code></pre>
<ul>
<li>Then we check to see if the ball hits the bottom of the screen and reverse the top direction, just like before.</li>
</ul>
<pre><code class="lang-csharp"><span class="hljs-keyword">if</span> (rect.Y &gt; Globals.HEIGHT - rect.Height) {
    top *= <span class="hljs-number">-1</span>;
}
</code></pre>
<ul>
<li>If the ball passes either edge along the X axis, it has gone off the screen. That means we need to award a point to the opposite player and reset the game. This specific if statement checks whether the ball has hit the left side of the screen.</li>
</ul>
<pre><code class="lang-csharp"><span class="hljs-keyword">if</span>(rect.X &lt; <span class="hljs-number">0</span>) {
    Globals.player2_score += <span class="hljs-number">1</span>;
    resetGame();
}
</code></pre>
<ul>
<li>And for this one, we're checking to see if the ball hits the right side of the screen, awarding a point to the first player.</li>
</ul>
<pre><code class="lang-csharp"><span class="hljs-keyword">if</span> (rect.X &gt; Globals.WIDTH - rect.Width) {
    Globals.player1_score += <span class="hljs-number">1</span>;
    resetGame();
}
</code></pre>
<p>We're calling a function <code>resetGame()</code> that we haven't made yet! We want it to put the ball back at the center of the screen, which should be easy for us. Go ahead and create it as part of the <code>Ball</code> class.</p>
<pre><code class="lang-csharp"><span class="hljs-function"><span class="hljs-keyword">public</span> <span class="hljs-keyword">void</span> <span class="hljs-title">resetGame</span>(<span class="hljs-params"></span>)</span> {
    rect.X = Globals.WIDTH / <span class="hljs-number">2</span> - <span class="hljs-number">20</span>;
    rect.Y = Globals.HEIGHT / <span class="hljs-number">2</span> - <span class="hljs-number">20</span>;
}
</code></pre>
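<p>Putting the fragments above together, the finished <code>Update()</code> in <code>Ball.cs</code> looks like this:</p>
<pre><code class="lang-csharp">public void Update(GameTime gameTime, Paddle player1, Paddle player2) {
    // Move the ball along its current 45-degree trajectory
    int deltaSpeed = (int)(moveSpeed * (float)gameTime.ElapsedGameTime.TotalSeconds);
    rect.X += right * deltaSpeed;
    rect.Y += top * deltaSpeed;

    // Bounce off the paddles
    if (player1.rect.Right &gt; rect.Left &amp;&amp; rect.Top &gt; player1.rect.Top &amp;&amp; rect.Bottom &lt; player1.rect.Bottom) {
        right = 1;
    }
    if (player2.rect.Left &lt; rect.Right &amp;&amp; rect.Top &gt; player2.rect.Top &amp;&amp; rect.Bottom &lt; player2.rect.Bottom) {
        right = -1;
    }

    // Bounce off the top and bottom of the window
    if (rect.Y &lt; 0) {
        top *= -1;
    }
    if (rect.Y &gt; Globals.HEIGHT - rect.Height) {
        top *= -1;
    }

    // Award a point and reset when the ball leaves the screen
    if (rect.X &lt; 0) {
        Globals.player2_score += 1;
        resetGame();
    }
    if (rect.X &gt; Globals.WIDTH - rect.Width) {
        Globals.player1_score += 1;
        resetGame();
    }
}
</code></pre>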
<p>Now, just like we did for the paddles, we need to add an instance of <code>Ball</code> to the <code>Game1</code> class.</p>
<p>In <code>Game1</code>:</p>
<ul>
<li><p>In the class variables, add <code>Ball ball;</code></p>
</li>
<li><p>In <code>Initialize()</code>, add <code>ball = new Ball();</code> right before <code>base.Initialize();</code></p>
</li>
<li><p>In <code>Update()</code>, add <code>ball.Update(gameTime, paddle, paddle2);</code> right before <code>base.Update(gameTime);</code></p>
</li>
<li><p>In <code>Draw()</code>, add <code>ball.Draw();</code> right before <code>Globals.spriteBatch.End();</code></p>
</li>
</ul>
<p>Let's run it and see our progress!</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1696740189673/a01ccc6e-0aa5-4481-810c-70d90735e6a5.gif" alt class="image--center mx-auto" /></p>
<p>We now have a working Pong! The only thing missing now is a score.</p>
<h1 id="heading-showing-the-score">Showing the Score</h1>
<p>In MonoGame, displaying text on the screen requires loading a font resource, typically in the form of a <code>SpriteFont</code>. This allows you to present information such as scores, game instructions, and other textual elements in your MonoGame project.</p>
<p>For our project, we're going to use it to show the scores at the top of the screen.</p>
<p>If you're using <code>Visual Studio 2022</code>, go to your files, open the <code>Content</code> folder, right click the <code>Content.mgcb</code> file, and open it with the MGCB Editor.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1696741512010/37f7b5e4-26ab-4bed-b73e-f6c0d952d249.png" alt class="image--center mx-auto" /></p>
<p>If you're using the command line, type <code>dotnet mgcb-editor</code> to open up the editor.</p>
<p>Either way, this opens the MGCB Editor, the graphical interface for content management.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1696741417729/fce7f3cf-92c0-47bb-ad3e-530c2cb38d01.png" alt class="image--center mx-auto" /></p>
<p>If your Content file isn't already there, go to the top, click <code>File &gt; Open...</code>, and navigate to your <code>Content.mgcb</code> file.</p>
<p>After you add it, right-click the <code>Content</code> node and click <code>Add &gt; New Item...</code></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1696741642471/8fcf002f-9236-406c-a228-bbffc312b293.png" alt class="image--center mx-auto" /></p>
<p>Create a <code>SpriteFont Description</code> called <code>Score</code>. This will generate a <code>.spritefont</code> file that we can use to apply settings for rendering to the screen.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1696741724182/5dfc27d1-b19e-41ac-8cd9-f1b62df163e3.png" alt class="image--center mx-auto" /></p>
<p>Let's save and exit out of the MGCB Editor.</p>
<p>The <code>Score.spritefont</code> file we created is just a standard XML file that defines properties of the font. By default, the selected font is Arial, which is fine, but you can change it to any font you want. Just make sure you add the <code>.ttf</code> file to the <code>Content</code> folder if the font isn't already installed on your machine.</p>
<p>For me, I downloaded the <a target="_blank" href="https://fonts.google.com/specimen/Press+Start+2P">Press Start 2P font</a>, put the file in the <code>Content</code> folder, then opened the <code>Score.spritefont</code> file and changed the font from Arial to the name of the <code>.ttf</code> file, in my case <code>prstart</code>. I chose this font because it has a nice 8-bit style.</p>
<p>To add our font to the game, open <code>Game1.cs</code> and:</p>
<ul>
<li><p>In the class variables, add <code>SpriteFont font;</code></p>
</li>
<li><p>In <code>LoadContent()</code>, add <code>font = Content.Load&lt;SpriteFont&gt;("Score");</code> at the bottom of the function.</p>
</li>
</ul>
<p>Add two <code>DrawString</code> calls to our <code>Draw()</code> function, underneath <code>Globals.spriteBatch.Begin()</code>:</p>
<pre><code class="lang-csharp">Globals.spriteBatch.DrawString(font, Globals.player1_score.ToString(), <span class="hljs-keyword">new</span> Vector2(<span class="hljs-number">100</span>, <span class="hljs-number">50</span>), Color.White);
Globals.spriteBatch.DrawString(font, Globals.player2_score.ToString(), <span class="hljs-keyword">new</span> Vector2(Globals.WIDTH - <span class="hljs-number">112</span>, <span class="hljs-number">50</span>), Color.White);
</code></pre>
<p>Let's save the file and run our project!</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1696749298019/f0ffff2d-f043-4b65-a465-6ece7e32799a.gif" alt class="image--center mx-auto" /></p>
<p>There we go! We now have a fully working version of Pong!</p>
<p>Great job on completing the Pong tutorial in MonoGame and C#! You've not only rebuilt a classic game, but also gained hands-on experience with core game development concepts. As you continue exploring new projects and honing your skills, you'll become more proficient in game development.</p>
<p>Explore more advanced MonoGame features or other Game Engines/Frameworks, experiment with different game mechanics, learn about game design principles, and create a portfolio of game projects.</p>
<p>Keep up the good work and happy coding!</p>
]]></content:encoded></item><item><title><![CDATA[<Part 2> Make spinning 3D shapes in SDL2 and OpenGL]]></title><description><![CDATA[Last part, we successfully created a window in SDL2 using OpenGL. Now that we understand how that works, we're going to learn how to render things on the screen by building a mesh class and creating shaders in GLSL, a shader programming language.
DIS...]]></description><link>https://hadicya.dev/part-2-make-spinning-3d-shapes-in-sdl2-and-opengl</link><guid isPermaLink="true">https://hadicya.dev/part-2-make-spinning-3d-shapes-in-sdl2-and-opengl</guid><category><![CDATA[C++]]></category><category><![CDATA[graphicsprogramming]]></category><category><![CDATA[openGL]]></category><category><![CDATA[SDL2]]></category><dc:creator><![CDATA[Hadi]]></dc:creator><pubDate>Fri, 25 Aug 2023 03:42:01 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1692916852591/ad5dac4b-3525-4b9d-8ca9-df4296a67c39.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Last part, we successfully created a window in SDL2 using OpenGL. Now that we understand how that works, we're going to learn how to render things on the screen by building a mesh class and creating shaders in GLSL, a shader programming language.</p>
<p><em>DISCLAIMER: This walkthrough assumes knowledge from Part 1 and a working knowledge of C++ and how to compile it.</em></p>
<p>GitHub Repo: <a target="_blank" href="https://github.com/HadiCya/spinning_shapes">https://github.com/HadiCya/spinning_shapes</a></p>
<p>YouTube Version:</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://youtu.be/ac6mf05O_qw">https://youtu.be/ac6mf05O_qw</a></div>
<p> </p>
<h1 id="heading-creating-our-mesh-class">Creating our Mesh class</h1>
<p>To start us off, we’re going to create a <code>Mesh</code> class that lets us create <code>Mesh</code> objects, since eventually we’re going to make this 3D.</p>
<p>Go ahead and create a <code>mesh.h</code> header file:</p>
<pre><code class="lang-cpp"><span class="hljs-meta">#<span class="hljs-meta-keyword">ifndef</span> MESH_H</span>
<span class="hljs-meta">#<span class="hljs-meta-keyword">define</span> MESH_H</span>

<span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">&lt;glad/glad.h&gt;</span></span>

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Mesh</span> {</span>
    <span class="hljs-keyword">public</span>:
        Mesh();
        <span class="hljs-function"><span class="hljs-keyword">void</span> <span class="hljs-title">draw</span><span class="hljs-params">()</span></span>;
    <span class="hljs-keyword">private</span>:
        GLuint VertexArrayID, vertexbuffer, elementbuffer, vertex_size;
};

<span class="hljs-meta">#<span class="hljs-meta-keyword">endif</span></span>
</code></pre>
<p>We will construct a class called <code>Mesh</code>, whose constructor will set up the mesh information: the vertices, the triangle points, and the order in which to draw them.</p>
<p>Before we write code, let's take a deep dive into how drawing vertices on the screen works. In computer graphics, a polygon mesh is rendered by drawing triangles. Even an advanced character in a movie or video game is made up of thousands of tiny triangles. Triangles are used because they are easy to manipulate and require little storage, which matters more and more as projects grow in size and complexity.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1692917845190/cdaa75bf-fff0-4f69-9b55-1902430fbb67.png" alt class="image--center mx-auto" /></p>
<p><a target="_blank" href="https://en.wikipedia.org/wiki/Utah_teapot">Utah Teapot</a></p>
<p>The way a polygon mesh stores its data is with a triangle array and a vertex array. The triangle array stores which vertices to access in the vertex array. For example, with the triangle array <code>[0, 2, 3, 0, 3, 1]</code>, the first triangle we draw accesses the vertices at positions <code>0, 2, 3</code> and then connects them on the screen, forming a triangle.</p>
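<p>To make this concrete, here is a plain C++ sketch (no OpenGL involved; the data matches the quad we build shortly, and the helper name <code>corner</code> is ours for illustration) of how the triangle array indexes into the vertex array:</p>

```cpp
#include <array>
#include <cstddef>

struct Vertex { float x, y, z; };

// Four vertices and two triangles, exactly as in the Mesh constructor below.
const std::array<Vertex, 4> kVertices = {{
    { 0.5f, -0.5f, 0.5f},
    {-0.5f, -0.5f, 0.5f},
    { 0.5f,  0.5f, 0.5f},
    {-0.5f,  0.5f, 0.5f},
}};
const std::array<int, 6> kTriangles = {0, 2, 3, 0, 3, 1};

// Look up the vertex used by corner `c` (0..2) of triangle `tri` (0..1):
// the triangle array holds positions into the vertex array.
Vertex corner(std::size_t tri, std::size_t c) {
    return kVertices[kTriangles[tri * 3 + c]];
}
```

<p>Triangle 0 resolves to vertices 0, 2, 3 and triangle 1 to vertices 0, 3, 1, so the two triangles share an edge and together cover the quad.</p>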
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1692927247689/578e3dd6-6748-4b60-89c6-0947c0be9c98.png" alt class="image--center mx-auto" /></p>
<p>In OpenGL, we manage this data with buffers, which are responsible for storing triangle and vertex data to be drawn on the screen.</p>
<p>Let's create our <code>mesh.cpp</code> file, and we'll see how we can implement this, and how it all works.</p>
<pre><code class="lang-cpp"><span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">&lt;vector&gt;</span></span>
<span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">"mesh.h"</span></span>
<span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">&lt;SDL.h&gt;</span></span>
<span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">&lt;iostream&gt;</span></span>

Mesh::Mesh() {
    glGenVertexArrays(<span class="hljs-number">1</span>, &amp;VertexArrayID);
    glBindVertexArray(VertexArrayID);

    GLfloat vertices[] = {
        <span class="hljs-number">0.5f</span>, <span class="hljs-number">-0.5f</span>, <span class="hljs-number">0.5f</span>,
        <span class="hljs-number">-0.5f</span>, <span class="hljs-number">-0.5f</span>, <span class="hljs-number">0.5f</span>,
        <span class="hljs-number">0.5f</span>, <span class="hljs-number">0.5f</span>, <span class="hljs-number">0.5f</span>,
        <span class="hljs-number">-0.5f</span>, <span class="hljs-number">0.5f</span>, <span class="hljs-number">0.5f</span>
    };

    GLint triangles[] = {<span class="hljs-number">0</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">0</span>, <span class="hljs-number">3</span>, <span class="hljs-number">1</span>};

    glGenBuffers(<span class="hljs-number">1</span>, &amp;vertexbuffer);
    glBindBuffer(GL_ARRAY_BUFFER, vertexbuffer);
    glBufferData(GL_ARRAY_BUFFER, <span class="hljs-keyword">sizeof</span>(vertices), vertices, GL_STATIC_DRAW);

    glGenBuffers(<span class="hljs-number">1</span>, &amp;elementbuffer);
    glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, elementbuffer);
    glBufferData(GL_ELEMENT_ARRAY_BUFFER, <span class="hljs-keyword">sizeof</span>(triangles), triangles, GL_STATIC_DRAW);

    vertex_size = <span class="hljs-keyword">sizeof</span>(triangles) / <span class="hljs-keyword">sizeof</span>(GLint);
}
</code></pre>
<p><code>glGenVertexArrays(1, &amp;VertexArrayID)</code> and <code>glBindVertexArray(VertexArrayID)</code> create and bind a vertex array object and its ID so that subsequent vertex buffer operations are stored in the new object.</p>
<p>We then define our <code>vertices</code> and <code>triangles</code> arrays, like we did earlier in the diagram.</p>
<p>Now, we have to generate buffers to store our data.</p>
<p>The vertex buffer is the mechanism that sends the vertex data to the GPU. Here, we are:</p>
<ol>
<li><p>Generating a unique ID for the buffer.</p>
</li>
<li><p>Binding the buffer, setting it as active for OpenGL to operate with.</p>
</li>
<li><p>Allocating the appropriate amount of memory and copying the data to the GPU. <code>GL_STATIC_DRAW</code> specifies that the data is set once and drawn many times.</p>
</li>
</ol>
<p>The element buffer, also known as an index buffer, sends the triangle index data to the GPU.</p>
<p>This series of calls does the same as for the vertex buffer but for the index data. <code>GL_ELEMENT_ARRAY_BUFFER</code> tells OpenGL that this buffer contains index data.</p>
<p>Lastly, we store the element count in <code>vertex_size</code> for use later in the <code>draw()</code> function.</p>
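<p>The element count comes from a standard C idiom: the array's total size in bytes divided by the size of one element. An OpenGL-free sketch (using plain <code>int</code> in place of <code>GLint</code>, which is also a 32-bit integer):</p>

```cpp
#include <cstddef>

// Same index data as the element buffer above.
constexpr int triangles[] = {0, 2, 3, 0, 3, 1};

// Bytes in the whole array divided by bytes per element = element count.
constexpr std::size_t vertex_size = sizeof(triangles) / sizeof(triangles[0]);
```

<p>Note this only works on a real array in the same scope; on a pointer, <code>sizeof</code> would just give the pointer size.</p>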
<p>Now, we need to make our <code>draw()</code> function:</p>
<pre><code class="lang-cpp"><span class="hljs-function"><span class="hljs-keyword">void</span> <span class="hljs-title">Mesh::draw</span><span class="hljs-params">()</span></span>{
    glEnableVertexAttribArray(<span class="hljs-number">0</span>);
    glBindBuffer(GL_ARRAY_BUFFER, vertexbuffer);
    glVertexAttribPointer(<span class="hljs-number">0</span>, <span class="hljs-number">3</span>, GL_FLOAT, GL_FALSE, <span class="hljs-number">0</span>, (<span class="hljs-keyword">void</span>*)<span class="hljs-number">0</span>);

    glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, elementbuffer);
    glDrawElements(GL_TRIANGLES, vertex_size, GL_UNSIGNED_INT, <span class="hljs-number">0</span>);

    glDisableVertexAttribArray(<span class="hljs-number">0</span>);
}
</code></pre>
<p>Let's go through this function line by line:</p>
<ul>
<li><p><code>glEnableVertexAttribArray(0)</code>: Enable the vertex attribute array at location 0 so its data is used during drawing.</p>
</li>
<li><p><code>glBindBuffer(GL_ARRAY_BUFFER, vertexbuffer)</code>: Bind the vertex buffer.</p>
</li>
<li><p><code>glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 0, (void*)0)</code>: Describe how the data for the vertices is stored in the vertex buffer.</p>
<ol>
<li><p><code>Index: 0</code> is telling the vertex shader (which we will touch on more), to get its vertex data from the vertex attribute at position 0.</p>
</li>
<li><p><code>Size: 3</code> is the number of components that make up one vertex. We want every 3 floats in the vertices array to represent one vertex, i.e. an <code>x, y, z</code> (3D) coordinate.</p>
</li>
<li><p><code>Type: GL_FLOAT</code> is the type of data we are using.</p>
</li>
<li><p><code>Normalized: GL_FALSE</code> specifies that fixed-point data values are accessed directly, instead of being normalized.</p>
</li>
<li><p><code>Stride: 0</code> specifies the distance between data in our buffer. Since we are only storing vertex information back-to-back, this is 0. If we were to, for example, store RGB values after every vertex, we would set this to <code>6 * sizeof(float)</code></p>
</li>
<li><p><code>Pointer: (void*)0</code> stores the pointer to where the attribute starts, and in our case, is 0.</p>
</li>
</ol>
</li>
<li><p><code>glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, elementbuffer)</code>: Bind the element buffer.</p>
</li>
<li><p><code>glDrawElements(GL_TRIANGLES, vertex_size, GL_UNSIGNED_INT, 0)</code>: Draw the triangles using the vertex and element buffers, reading <code>vertex_size</code> indices in total.</p>
</li>
<li><p><code>glDisableVertexAttribArray(0)</code>: Disable the first attribute array after drawing.</p>
</li>
</ul>
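<p>The stride parameter is easiest to see with a hypothetical interleaved layout (this is not our mesh; it is only to illustrate the <code>6 * sizeof(float)</code> remark above). If each vertex stored <code>x, y, z, r, g, b</code> back-to-back, both the position and color attributes would share one stride, and the color data would start 3 floats into each vertex:</p>

```cpp
#include <cstddef>

// Hypothetical interleaved vertex: position (x, y, z) then color (r, g, b).
struct InterleavedVertex { float x, y, z, r, g, b; };

// Stride: byte distance from the start of one vertex to the start of the next.
constexpr std::size_t stride = 6 * sizeof(float);

// The color attribute begins 3 floats after the start of each vertex.
constexpr std::size_t color_offset = 3 * sizeof(float);
```

<p>With such a layout, position would be described as <code>glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, stride, (void*)0)</code> and color as attribute 1 with the pointer argument set to <code>(void*)color_offset</code>.</p>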
<h1 id="heading-writing-our-shaders">Writing our shaders</h1>
<p>Before we can start drawing things on the screen, OpenGL requires us to have at minimum a fragment and a vertex shader. We're going to write these in the OpenGL Shading Language (GLSL), a shader language based on the C programming language. Shaders aren't our primary focus for this series, so we're only going to cover the basics.</p>
<p>First, we’ll create a file called <code>vertex.glsl</code> to store our Vertex Shader:</p>
<pre><code class="lang-c"><span class="hljs-meta">#version 330 core</span>

layout (location = <span class="hljs-number">0</span>) in vec3 aPos;

<span class="hljs-function"><span class="hljs-keyword">void</span> <span class="hljs-title">main</span><span class="hljs-params">()</span> </span>{
    gl_Position = vec4(aPos, <span class="hljs-number">1.0</span>);
}
</code></pre>
<p>The vertex shader is responsible for telling the GPU where every point is to be drawn, transforming its position and other attributes with mathematical calculations. In this case, we're not going to be doing much.</p>
<ul>
<li><p><code>#version 330 core</code>: This sets the GLSL version to 3.30 and specifies the core profile. (In the last part we talked about using core over compatibility)</p>
</li>
<li><p><code>layout (location = 0) in vec3 aPos</code>: This declares a 3-component vector input variable <code>aPos</code>, which represents the position of the vertex. The <code>layout (location = 0)</code> part explicitly sets the location of this attribute to 0.</p>
</li>
<li><p><code>void main() { ... }</code>: The main function of the shader. It's executed once for every vertex.</p>
</li>
<li><p><code>gl_Position = vec4(aPos, 1.0)</code>: This converts the input 3D position into a 4D vector by adding a fourth component with a value of 1.0. This is common in graphics to represent homogeneous coordinates.</p>
</li>
</ul>
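<p>That fourth component is the <code>w</code> coordinate: after the vertex shader runs, the GPU divides <code>gl_Position</code> by <code>w</code> (the perspective divide) to get normalized device coordinates. A small C++ sketch of that fixed-function step (the function name is ours, purely for illustration):</p>

```cpp
#include <array>

// The divide the GPU applies to gl_Position: (x, y, z, w) -> (x/w, y/w, z/w).
// With w = 1.0, as in our shader, the position passes through unchanged;
// perspective projection later uses w != 1 to shrink distant points.
std::array<float, 3> perspective_divide(const std::array<float, 4>& p) {
    return {p[0] / p[3], p[1] / p[3], p[2] / p[3]};
}
```
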
<p>Now we want to create a file called <code>fragment.glsl</code> for our Fragment Shader</p>
<pre><code class="lang-cpp"><span class="hljs-meta">#version 330 core</span>

out vec3 color;

<span class="hljs-function"><span class="hljs-keyword">void</span> <span class="hljs-title">main</span><span class="hljs-params">()</span> </span>{
    color = vec3(<span class="hljs-number">1</span>,<span class="hljs-number">1</span>,<span class="hljs-number">1</span>);
}
</code></pre>
<p>The fragment shader is responsible for telling the GPU what color information needs to be drawn, for every pixel that our geometry covers. This will also be simple, especially since later we will just be using the wireframe setting.</p>
<ul>
<li><p><code>out vec3 color</code>: This declares a 3-component vector output variable <code>color</code>, which will store the output color of the fragment.</p>
</li>
<li><p><code>void main() { ... }</code>: The main function of the shader, executed once for every fragment (potential pixel on the screen).</p>
</li>
<li><p><code>color = vec3(1,1,1)</code>: This sets the output color to white (1,1,1), meaning every fragment processed by this shader will have this color.</p>
</li>
</ul>
<p>We’re also going to use an existing shader loader and header. Understanding it isn’t necessary for this series, so go ahead and copy the code directly.</p>
<p>Create the file <code>loadShader.h</code> and populate it with this code:</p>
<pre><code class="lang-cpp"><span class="hljs-meta">#<span class="hljs-meta-keyword">ifndef</span> LOAD_SHADER_H</span>
<span class="hljs-meta">#<span class="hljs-meta-keyword">define</span> LOAD_SHADER_H</span>

<span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">&lt;glad/glad.h&gt;</span></span>

<span class="hljs-function">GLuint <span class="hljs-title">LoadShaders</span><span class="hljs-params">(<span class="hljs-keyword">const</span> <span class="hljs-keyword">char</span> * vertex_file_path, <span class="hljs-keyword">const</span> <span class="hljs-keyword">char</span> * fragment_file_path)</span></span>;

<span class="hljs-meta">#<span class="hljs-meta-keyword">endif</span> <span class="hljs-comment">// LOAD_SHADER_H</span></span>
</code></pre>
<p>Create the file <code>loadShader.cpp</code> and populate it with this code:</p>
<pre><code class="lang-cpp"><span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">"loadShader.h"</span></span>
<span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">&lt;SDL.h&gt;</span></span>
<span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">&lt;fstream&gt;</span></span>
<span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">&lt;sstream&gt;</span></span>
<span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">&lt;iostream&gt;</span></span>

<span class="hljs-function">GLuint <span class="hljs-title">LoadShaders</span><span class="hljs-params">(<span class="hljs-keyword">const</span> <span class="hljs-keyword">char</span> * vertex_file_path,<span class="hljs-keyword">const</span> <span class="hljs-keyword">char</span> * fragment_file_path)</span></span>{

 <span class="hljs-comment">// Create the shaders</span>
 GLuint VertexShaderID = glCreateShader(GL_VERTEX_SHADER);
 GLuint FragmentShaderID = glCreateShader(GL_FRAGMENT_SHADER);

 <span class="hljs-comment">// Read the Vertex Shader code from the file</span>
 <span class="hljs-built_in">std</span>::<span class="hljs-built_in">string</span> VertexShaderCode;
 <span class="hljs-function"><span class="hljs-built_in">std</span>::ifstream <span class="hljs-title">VertexShaderStream</span><span class="hljs-params">(vertex_file_path, <span class="hljs-built_in">std</span>::ios::in)</span></span>;
 <span class="hljs-keyword">if</span>(VertexShaderStream.is_open()){
  <span class="hljs-built_in">std</span>::<span class="hljs-built_in">stringstream</span> sstr;
  sstr &lt;&lt; VertexShaderStream.rdbuf();
  VertexShaderCode = sstr.str();
  VertexShaderStream.close();
 }<span class="hljs-keyword">else</span>{
  <span class="hljs-built_in">printf</span>(<span class="hljs-string">"Impossible to open %s. Are you in the right directory ? Don't forget to read the FAQ !\n"</span>, vertex_file_path);
  getchar();
  <span class="hljs-keyword">return</span> <span class="hljs-number">0</span>;
 }

 <span class="hljs-comment">// Read the Fragment Shader code from the file</span>
 <span class="hljs-built_in">std</span>::<span class="hljs-built_in">string</span> FragmentShaderCode;
 <span class="hljs-function"><span class="hljs-built_in">std</span>::ifstream <span class="hljs-title">FragmentShaderStream</span><span class="hljs-params">(fragment_file_path, <span class="hljs-built_in">std</span>::ios::in)</span></span>;
 <span class="hljs-keyword">if</span>(FragmentShaderStream.is_open()){
  <span class="hljs-built_in">std</span>::<span class="hljs-built_in">stringstream</span> sstr;
  sstr &lt;&lt; FragmentShaderStream.rdbuf();
  FragmentShaderCode = sstr.str();
  FragmentShaderStream.close();
 }

 GLint Result = GL_FALSE;
 <span class="hljs-keyword">int</span> InfoLogLength;

 <span class="hljs-comment">// Compile Vertex Shader</span>
 <span class="hljs-built_in">printf</span>(<span class="hljs-string">"Compiling shader : %s\n"</span>, vertex_file_path);
 <span class="hljs-keyword">char</span> <span class="hljs-keyword">const</span> * VertexSourcePointer = VertexShaderCode.c_str();
 glShaderSource(VertexShaderID, <span class="hljs-number">1</span>, &amp;VertexSourcePointer , <span class="hljs-literal">NULL</span>);
 glCompileShader(VertexShaderID);

 <span class="hljs-comment">// Check Vertex Shader</span>
 glGetShaderiv(VertexShaderID, GL_COMPILE_STATUS, &amp;Result);
 glGetShaderiv(VertexShaderID, GL_INFO_LOG_LENGTH, &amp;InfoLogLength);
 <span class="hljs-keyword">if</span> ( InfoLogLength &gt; <span class="hljs-number">0</span> ){
  <span class="hljs-function"><span class="hljs-built_in">std</span>::<span class="hljs-built_in">vector</span>&lt;<span class="hljs-keyword">char</span>&gt; <span class="hljs-title">VertexShaderErrorMessage</span><span class="hljs-params">(InfoLogLength+<span class="hljs-number">1</span>)</span></span>;
  glGetShaderInfoLog(VertexShaderID, InfoLogLength, <span class="hljs-literal">NULL</span>, &amp;VertexShaderErrorMessage[<span class="hljs-number">0</span>]);
  <span class="hljs-built_in">printf</span>(<span class="hljs-string">"%s\n"</span>, &amp;VertexShaderErrorMessage[<span class="hljs-number">0</span>]);
 }

 <span class="hljs-comment">// Compile Fragment Shader</span>
 <span class="hljs-built_in">printf</span>(<span class="hljs-string">"Compiling shader : %s\n"</span>, fragment_file_path);
 <span class="hljs-keyword">char</span> <span class="hljs-keyword">const</span> * FragmentSourcePointer = FragmentShaderCode.c_str();
 glShaderSource(FragmentShaderID, <span class="hljs-number">1</span>, &amp;FragmentSourcePointer , <span class="hljs-literal">NULL</span>);
 glCompileShader(FragmentShaderID);

 <span class="hljs-comment">// Check Fragment Shader</span>
 glGetShaderiv(FragmentShaderID, GL_COMPILE_STATUS, &amp;Result);
 glGetShaderiv(FragmentShaderID, GL_INFO_LOG_LENGTH, &amp;InfoLogLength);
 <span class="hljs-keyword">if</span> ( InfoLogLength &gt; <span class="hljs-number">0</span> ){
  <span class="hljs-function"><span class="hljs-built_in">std</span>::<span class="hljs-built_in">vector</span>&lt;<span class="hljs-keyword">char</span>&gt; <span class="hljs-title">FragmentShaderErrorMessage</span><span class="hljs-params">(InfoLogLength+<span class="hljs-number">1</span>)</span></span>;
  glGetShaderInfoLog(FragmentShaderID, InfoLogLength, <span class="hljs-literal">NULL</span>, &amp;FragmentShaderErrorMessage[<span class="hljs-number">0</span>]);
  <span class="hljs-built_in">printf</span>(<span class="hljs-string">"%s\n"</span>, &amp;FragmentShaderErrorMessage[<span class="hljs-number">0</span>]);
 }

 <span class="hljs-comment">// Link the program</span>
 <span class="hljs-built_in">printf</span>(<span class="hljs-string">"Linking program\n"</span>);
 GLuint ProgramID = glCreateProgram();
 glAttachShader(ProgramID, VertexShaderID);
 glAttachShader(ProgramID, FragmentShaderID);
 glLinkProgram(ProgramID);

 <span class="hljs-comment">// Check the program</span>
 glGetProgramiv(ProgramID, GL_LINK_STATUS, &amp;Result);
 glGetProgramiv(ProgramID, GL_INFO_LOG_LENGTH, &amp;InfoLogLength);
 <span class="hljs-keyword">if</span> ( InfoLogLength &gt; <span class="hljs-number">0</span> ){
  <span class="hljs-function"><span class="hljs-built_in">std</span>::<span class="hljs-built_in">vector</span>&lt;<span class="hljs-keyword">char</span>&gt; <span class="hljs-title">ProgramErrorMessage</span><span class="hljs-params">(InfoLogLength+<span class="hljs-number">1</span>)</span></span>;
  glGetProgramInfoLog(ProgramID, InfoLogLength, <span class="hljs-literal">NULL</span>, &amp;ProgramErrorMessage[<span class="hljs-number">0</span>]);
  <span class="hljs-built_in">printf</span>(<span class="hljs-string">"%s\n"</span>, &amp;ProgramErrorMessage[<span class="hljs-number">0</span>]);
 }

 glDetachShader(ProgramID, VertexShaderID);
 glDetachShader(ProgramID, FragmentShaderID);

 glDeleteShader(VertexShaderID);
 glDeleteShader(FragmentShaderID);

 <span class="hljs-keyword">return</span> ProgramID;
}
</code></pre>
<p>Credit to <a target="_blank" href="http://www.opengl-tutorial.org/">opengl-tutorial.org</a> for the shader loader.</p>
<h1 id="heading-making-it-all-work">Making it all work</h1>
<p>Now that we have our <code>Mesh</code> class completed, and our shaders set up, all we have to do now is make it work in our <code>main.cpp</code></p>
<p>Let’s update our includes in <code>main.cpp</code> to:</p>
<pre><code class="lang-cpp"><span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">&lt;iostream&gt;</span></span>
<span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">&lt;SDL.h&gt;</span></span>
<span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">"mesh.h"</span></span>
<span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">"loadShader.h"</span></span>
</code></pre>
<p>After we call <code>gladLoadGLLoader(SDL_GL_GetProcAddress)</code> we’re going to want to initialize our <code>Mesh</code> and load our shaders:</p>
<pre><code class="lang-cpp">    SDL_GLContext context;
    context = SDL_GL_CreateContext(window);

    gladLoadGLLoader(SDL_GL_GetProcAddress);

    Mesh cube;

    GLuint programID = LoadShaders(<span class="hljs-string">"vertex.glsl"</span>, <span class="hljs-string">"fragment.glsl"</span>);

    <span class="hljs-keyword">bool</span> done = <span class="hljs-literal">false</span>;
</code></pre>
<p>Lastly, in our while loop, we’re going to want to actually draw our <code>Mesh</code> but not before we tell OpenGL to use the shaders we loaded earlier into <code>programID</code></p>
<p>We’re also going to call <code>glPolygonMode(GL_FRONT_AND_BACK, GL_LINE)</code>, which renders everything on the screen in wireframe mode. This shows how the triangles are being drawn, as we described earlier. And also, wireframe looks cool.</p>
<pre><code class="lang-cpp">    <span class="hljs-keyword">while</span>(!done) {
        glViewport(<span class="hljs-number">0</span>, <span class="hljs-number">0</span>, screen_width, screen_height);

        SDL_Event event;
        <span class="hljs-keyword">while</span>(SDL_PollEvent(&amp;event)){
            <span class="hljs-keyword">if</span>(event.type == SDL_QUIT) {
                done = <span class="hljs-literal">true</span>;
            }
        }
        glClear(GL_DEPTH_BUFFER_BIT | GL_COLOR_BUFFER_BIT);

        glUseProgram(programID);
        glPolygonMode(GL_FRONT_AND_BACK, GL_LINE);

        cube.draw();    

        SDL_GL_SwapWindow(window);
</code></pre>
<p>That’s all the changes we need! Go ahead and compile your code yet again, this time adding <code>mesh.cpp</code> and <code>loadShader.cpp</code> to your compilation list. This is how my compilation command looks now:</p>
<p><code>clang++ -std=c++20 main.cpp mesh.cpp loadShader.cpp ./glad/src/glad.c -o spinning_shapes -I/Library/Frameworks/SDL2.framework/Headers -I./glad/include -F/Library/Frameworks -framework SDL2</code></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1692914937898/dbee45f1-fa57-4d5e-8e4f-714abccd58e0.png" alt class="image--center mx-auto" /></p>
<p>If you see a slightly stretched-out square, congrats! You’ve successfully rendered a “square” onto the screen.</p>
<p>You might be asking: if our vertices are uniform, why is it stretched like that?</p>
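<p>A quick back-of-the-envelope sketch hints at the answer. Normalized device coordinates run from -1 to 1 on both axes regardless of the window's shape, so the same NDC extent covers a different number of pixels horizontally and vertically (assuming, say, a hypothetical 800x600 window):</p>

```cpp
// NDC spans -1..1 on each axis, so an NDC extent maps to
// (extent / 2) * viewport_size pixels on that axis.
int ndc_extent_to_pixels(float ndc_extent, int viewport_size) {
    return static_cast<int>(ndc_extent / 2.0f * viewport_size);
}
```

<p>In an 800x600 window, our 1x1 "square" in NDC would cover 400x300 pixels: wider than it is tall.</p>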
<p>That, and how we're going to make this a spinning cube, will be answered in our next and final part!</p>
<p>Thanks for reading!</p>
]]></content:encoded></item><item><title><![CDATA[<Part 1> Make spinning 3D shapes in SDL2 and OpenGL]]></title><description><![CDATA[In this miniseries, you will be creating a program in C++ that will be able to render 3D shapes in SDL2 and OpenGL, and more importantly, learn about what is happening in depth while you write your program.
As a game developer, using SDL2 and OpenGL ...]]></description><link>https://hadicya.dev/part-1-make-spinning-3d-shapes-in-sdl2-and-opengl</link><guid isPermaLink="true">https://hadicya.dev/part-1-make-spinning-3d-shapes-in-sdl2-and-opengl</guid><category><![CDATA[C++]]></category><category><![CDATA[openGL]]></category><category><![CDATA[SDL2]]></category><category><![CDATA[Game Development]]></category><category><![CDATA[graphicsprogramming]]></category><dc:creator><![CDATA[Hadi]]></dc:creator><pubDate>Thu, 17 Aug 2023 09:26:35 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1692913492369/8d6bfc94-fb00-4499-b960-ee16e81d873e.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In this miniseries, you will be creating a program in C++ that will be able to render 3D shapes in SDL2 and OpenGL, and more importantly, learn about what is happening in depth while you write your program.</p>
<p>As a game developer, using SDL2 and OpenGL in a world of advanced game engines such as Unity and Unreal Engine may seem like a waste of time. And you’d probably be right: while programming at a low level brings benefits such as speed, reliability, and smaller binaries, most games you could produce with a low-level framework (in a timely manner) can be made much faster in a game engine, and they’ll still run fine on low-spec machines and great on average ones.</p>
<p>So, why bother?</p>
<p>By learning the components of rendering in a low-level setting, you gain a deeper understanding of how a game engine works under the hood, which in turn teaches you how to optimize your games better, making you a better game developer. And more importantly, because it is cool. That’s why I do it, at least. There are many other technical reasons that could be argued, but we needn’t dwell; let’s just begin.</p>
<p>Let’s get started!</p>
<p><em>DISCLAIMER: This walkthrough assumes a working knowledge of C++ and how to compile it.</em></p>
<p>GitHub Repo: <a target="_blank" href="https://github.com/HadiCya/spinning_shapes">https://github.com/HadiCya/spinning_shapes</a></p>
<p>YouTube version:</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://youtu.be/ac6mf05O_qw">https://youtu.be/ac6mf05O_qw</a></div>
<p> </p>
<h2 id="heading-what-youll-need"><strong>What you’ll need</strong></h2>
<p>First things first, let’s get set up. The tools we’ll need for this are (continue reading for more detailed instructions):</p>
<ul>
<li><p><a target="_blank" href="https://www.khronos.org/opengl/wiki/Getting_Started#Downloading_OpenGL">OpenGL</a> (OpenGL is most likely installed by default on your machine, but you must ensure the latest version is installed)</p>
</li>
<li><p>A C++ compiler (I use clang++ and compile C++20)</p>
</li>
<li><p><a target="_blank" href="https://www.libsdl.org/">SDL 2.0</a></p>
</li>
<li><p><a target="_blank" href="https://glad.dav1d.de/">GLAD</a></p>
</li>
<li><p><a target="_blank" href="https://glm.g-truc.net/0.9.9/">GLM</a></p>
</li>
</ul>
<p>Before we get started, create a directory to store this project in. I named mine <code>spinning_shapes</code></p>
<p>SDL2 is a development library written in C that provides low-level access to computer graphics hardware (through graphics libraries such as OpenGL), audio, video, peripherals, etc. We will be using it to actually display a window capable of rendering visuals.</p>
<p>Download SDL2 from the Releases tab on the <a target="_blank" href="https://www.libsdl.org/">SDL2 website</a>. Follow this <a target="_blank" href="https://wiki.libsdl.org/SDL2/Installation">installation guide</a> for more specifics about your operating system.</p>
<p>Whenever you want to access OpenGL, you’ll need some kind of loading library. Many of them have minor differences and accomplish exceedingly similar things. We will be using GLAD, a multi-language loader-generator for OpenGL. The GLAD web tool we will be using will be able to generate a header file for us.</p>
<p>Go to the official <a target="_blank" href="https://glad.dav1d.de/">GLAD loader-generator website</a>, here you will select <strong>C/C++</strong> under <strong>Language</strong> and <strong>OpenGL</strong> under <strong>Specification.</strong></p>
<p>Under the API tab, we’re going to be selecting gl <strong>Version 4.1,</strong> and none for the rest. For <strong>Profile,</strong> we will be using <strong>Core</strong> instead of <strong>Compatibility.</strong> The Extensions column on the left is going to be populated, and we’re going to go ahead and click <strong>ADD ALL,</strong> and click the <strong>GENERATE</strong> button on the bottom.</p>
<p><img src="https://miro.medium.com/v2/resize:fit:1400/1*qbOHo3vDMP4RMhkkS3jjAA.png" alt="GLAD Loader-Generator Web Tool" /></p>
<p>After you generate your GLAD zip, download it, and extract the glad folder into your project folder.</p>
<p><img src="https://miro.medium.com/v2/resize:fit:1400/1*4ISXO6GVoVbqFJbQVF0Mdw.png" alt="GLAD Generated Files" /></p>
<p>We’re going to need to be capable of making vector and matrix calculations that we wouldn’t otherwise be able to in OpenGL. This is where GLM comes in, a mathematics library for OpenGL and the OpenGL Shading Language (GLSL). Go ahead and download the latest release from the <a target="_blank" href="https://glm.g-truc.net/">GLM Github</a>. Extract it and put it in the project directory.</p>
<p>That’s all the downloads/installations we’ll need!</p>
<p>As I’m sure you’ve already discovered, installing and configuring these C++ libraries can prove difficult from machine to machine, and you’re probably going to have to do your own research to get things working, especially if you’re a beginner. It’s all part of the learning. You can see exactly what works for me in my <a target="_blank" href="https://youtu.be/ac6mf05O_qw">YouTube video</a>.</p>
<h2 id="heading-lets-make-a-window"><strong>Let's make a window</strong></h2>
<p>Let’s begin by creating a file called <code>main.cpp</code>. This will hold our main C++ code.</p>
<pre><code class="lang-cpp"><span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">&lt;iostream&gt;</span></span>
<span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">&lt;glad/glad.h&gt;</span></span>
<span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">&lt;SDL.h&gt;</span></span>

<span class="hljs-function"><span class="hljs-keyword">int</span> <span class="hljs-title">main</span><span class="hljs-params">(<span class="hljs-keyword">int</span> argc, <span class="hljs-keyword">char</span>* argv[])</span></span>{
    SDL_Window* window = <span class="hljs-literal">nullptr</span>;

    <span class="hljs-keyword">int</span> screen_width = <span class="hljs-number">1000</span>;
    <span class="hljs-keyword">int</span> screen_height = <span class="hljs-number">800</span>;

    <span class="hljs-keyword">if</span> (SDL_Init(SDL_INIT_VIDEO) &lt; <span class="hljs-number">0</span>) {
        <span class="hljs-built_in">std</span>::<span class="hljs-built_in">cout</span> &lt;&lt; <span class="hljs-string">"SDL failed initialization. "</span> &lt;&lt; SDL_GetError();
        <span class="hljs-keyword">return</span> <span class="hljs-number">-1</span>;
    } <span class="hljs-keyword">else</span> {
        <span class="hljs-built_in">std</span>::<span class="hljs-built_in">cout</span> &lt;&lt; <span class="hljs-string">"SDL successfully initialized."</span>;
    }
</code></pre>
<p>We’ll need to start the program off by including <code>&lt;iostream&gt;</code> along with <code>&lt;glad/glad.h&gt;</code> and <code>&lt;SDL.h&gt;</code></p>
<p>In our <code>main()</code> function, we declare an <code>SDL_Window</code> pointer and initialize it to <code>nullptr</code>. We will assign it later, once we’ve set up some OpenGL attributes.</p>
<p>We will define our screen height and width, and then check to see if SDL successfully initializes, returning an error if it doesn’t.</p>
<pre><code class="lang-cpp">    SDL_GL_SetAttribute(SDL_GL_CONTEXT_MAJOR_VERSION, <span class="hljs-number">4</span>);
    SDL_GL_SetAttribute(SDL_GL_CONTEXT_MINOR_VERSION, <span class="hljs-number">1</span>);
    SDL_GL_SetAttribute(SDL_GL_CONTEXT_PROFILE_MASK, SDL_GL_CONTEXT_PROFILE_CORE);
    SDL_GL_SetAttribute(SDL_GL_DOUBLEBUFFER, <span class="hljs-number">1</span>);
    SDL_GL_SetAttribute(SDL_GL_DEPTH_SIZE, <span class="hljs-number">24</span>);

    window = SDL_CreateWindow(<span class="hljs-string">"C++ SDL2 Window | OpenGL"</span>, 
            <span class="hljs-number">0</span>, 
            <span class="hljs-number">0</span>, 
            screen_width, 
            screen_height, 
            SDL_WINDOW_SHOWN | SDL_WINDOW_OPENGL);
    SDL_GLContext context;
    context = SDL_GL_CreateContext(window);

    gladLoadGLLoader(SDL_GL_GetProcAddress);
</code></pre>
<p>This is a lot of SDL and OpenGL! It is daunting at first glance, but we will be breaking it down to see what exactly is happening, in a digestible way.</p>
<p><code>SDL_GL_SetAttribute()</code> is a function in SDL2 for setting window attributes in OpenGL before the window is created. Let’s go through this line by line:</p>
<ul>
<li><p><code>SDL_GL_CONTEXT_MAJOR_VERSION</code> By setting this attribute to 4, we are telling SDL we intend to use version 4.x.</p>
</li>
<li><p><code>SDL_GL_CONTEXT_MINOR_VERSION</code> Setting this attribute to 1 completes the version: SDL now knows we intend to use OpenGL 4.1.</p>
</li>
<li><p><code>SDL_GL_CONTEXT_PROFILE_MASK</code> selects the OpenGL context profile, which determines which features are available. When setting up GLAD earlier, we chose Core over Compatibility. The Compatibility profile additionally supports a legacy, entirely different style of rendering objects on the screen, which isn’t necessary for us. That’s why it’s important to tell SDL what should be available and set our mask to <code>SDL_GL_CONTEXT_PROFILE_CORE</code></p>
</li>
<li><p><code>SDL_GL_DOUBLEBUFFER</code> Double buffering is a rendering technique that maintains two separate buffers, hence the name. While one frame is being displayed on the screen, the next is drawn into a secondary buffer in the background, so the renderer can quickly “swap” to it. This reduces flickering and makes rendering smoother.</p>
</li>
<li><p><code>SDL_GL_DEPTH_SIZE</code> Depth size is the number of bits allocated for the depth buffer (or z-buffer). The depth buffer decides whether a pixel should be drawn based on its depth relative to previously rendered pixels. In essence, it ensures that objects behind other objects are correctly hidden, rather than far-away geometry being drawn on top of whatever is in front of the camera. <code>24</code> balances well between precision and memory usage, and is the most common value.</p>
</li>
</ul>
<p>Now that we’ve set the attributes for our window, we can create the window!</p>
<p><code>SDL_CreateWindow()</code> takes 6 parameters: a title, the x and y positions of the top-left pixel (relative to the computer screen), the window width and height, and <code>FLAGS</code></p>
<p>We use two flags: <code>SDL_WINDOW_SHOWN</code> and <code>SDL_WINDOW_OPENGL</code>; the former indicates that the window should be shown immediately, and the latter prepares the window for use with OpenGL. For more info, see <a target="_blank" href="https://wiki.libsdl.org/SDL2/SDL_WindowFlags">SDL2 Window_Flags values.</a></p>
<p>Now, you can create an OpenGL context, which is the environment that holds OpenGL’s state and resources. It manages OpenGL’s rendering operations and its interactions with the graphics system. Through this context we will manage the OpenGL state, framebuffer, functions, shaders, textures, buffers, and viewport.</p>
<p>Last thing in this block of window setup: we call <code>gladLoadGLLoader()</code>, which initializes the GLAD library, loading the OpenGL function pointers via <code>SDL_GL_GetProcAddress</code></p>
<p>That wasn’t so bad, right? There are a lot of abstractions in OpenGL, and it’s just a matter of taking the time to slowly learn them and try new things. This block of code is crucial to preparing our window for OpenGL rendering. Now, let’s finish this window creation by making our main loop!</p>
<pre><code class="lang-cpp">    <span class="hljs-keyword">bool</span> done = <span class="hljs-literal">false</span>;

    <span class="hljs-keyword">while</span>(!done) {
        glViewport(<span class="hljs-number">0</span>, <span class="hljs-number">0</span>, screen_width, screen_height);

        SDL_Event event;
        <span class="hljs-keyword">while</span>(SDL_PollEvent(&amp;event)){
            <span class="hljs-keyword">if</span>(event.type == SDL_QUIT) {
                done = <span class="hljs-literal">true</span>;
            }
        }
        glClear(GL_DEPTH_BUFFER_BIT | GL_COLOR_BUFFER_BIT);
        SDL_GL_SwapWindow(window);
    }
    SDL_DestroyWindow(window);
    SDL_Quit();
    <span class="hljs-keyword">return</span> <span class="hljs-number">0</span>;
}
</code></pre>
<p>We’re going to create a boolean <code>done</code> that will control our while loop, which will handle the updating of our window.</p>
<p>We will define a <code>glViewport()</code> which, similar to <code>SDL_CreateWindow()</code>, takes the x and y pixels of the top left of our window, and the width and height we want for the viewport.</p>
<p>With <code>SDL_Event</code> we check whether any events have happened, such as the user quitting the application or pressing a key. After that, we call <code>glClear(GL_DEPTH_BUFFER_BIT | GL_COLOR_BUFFER_BIT)</code></p>
<p>While this function call isn’t strictly necessary for us yet, it clears the depth and color information between every iteration of the while loop, essentially resetting the frame so we can draw the next one. Finally, <code>SDL_GL_SwapWindow()</code> “swaps” the secondary buffer that was being drawn in the background with the buffer currently being displayed.</p>
<p>Go ahead and compile your code; keep in mind you’ll also have to compile the GLAD file and include the SDL2 and GLAD frameworks. This is the terminal command I used when compiling (I use macOS):</p>
<p><code>clang++ -std=c++20 main.cpp ./glad/src/glad.c -o spinning_shapes -I/Library/Frameworks/SDL2.framework/Headers -I./glad/include -F/Library/Frameworks -framework SDL2</code></p>
<p>Go ahead and run your program. I named mine <code>spinning_shapes</code> and ran it with the command <code>./spinning_shapes</code></p>
<p><img src="https://miro.medium.com/v2/resize:fit:1400/1*3dYPV30AWSjAcekjkoEptg.png" alt="An empty window" /></p>
<p>If you did everything right, an empty window is what you’ll see! How exciting.</p>
<p>In the next post, we will be integrating shaders into our program so that we are able to start drawing onto the screen, and we will create a square mesh to render.</p>
<p>Thanks for reading!</p>
]]></content:encoded></item><item><title><![CDATA[Advanced Operating Systems]]></title><description><![CDATA[OS Structures
SPIN Approach
What are we shooting for in an OS structure?

Thin, only mechanisms should be in the kernel, not policies in the kernel itself (like microkernel)

Access to resources, without border crossing as much as possible (like DOS)...]]></description><link>https://hadicya.dev/advanced-operating-systems</link><guid isPermaLink="true">https://hadicya.dev/advanced-operating-systems</guid><category><![CDATA[advanced operating systems]]></category><category><![CDATA[cs6210]]></category><category><![CDATA[operating system]]></category><category><![CDATA[georgia-tech]]></category><dc:creator><![CDATA[Hadi]]></dc:creator><pubDate>Thu, 01 Jan 1970 17:00:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1726000715926/b3a7f986-8218-4e81-ab55-71723ca0e9a8.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h1 id="heading-os-structures">OS Structures</h1>
<h2 id="heading-spin-approach">SPIN Approach</h2>
<h3 id="heading-what-are-we-shooting-for-in-an-os-structure">What are we shooting for in an OS structure?</h3>
<ul>
<li><p>Thin, only mechanisms should be in the kernel, not policies in the kernel itself (like microkernel)</p>
</li>
<li><p>Access to resources, without border crossing as much as possible (like DOS)</p>
</li>
<li><p>Flexibility for resource management, so that resource management can be easily morphed to suit the needs of an application (like microkernel)</p>
</li>
<li><p>Flexibility without sacrificing protection and performance (like monolithic)</p>
</li>
</ul>
<h3 id="heading-approaches-to-extensibility">Approaches to Extensibility</h3>
<p>Capability based (Hydra OS (Wulf ‘81))</p>
<ul>
<li><p>Kernel Mechanisms (not policies) for resource allocation</p>
</li>
<li><p>Capability based resource access. A capability is an entity that can be passed from one to the other, cannot be forged, and can be verified.</p>
</li>
<li><p>Resource managers were built as coarse-grained objects to reduce border crossing overhead. A border crossing in Hydra OS meant passing a capability from one object to another, and validating the capability on entry into a particular object. Building resource managers as coarse-grained objects limits the opportunities for customization and extensibility.</p>
</li>
<li><p>Hydra had good ideas: provide minimal mechanisms in the kernel, and have resource managers implement the policies. But capabilities, as the fundamental means of resource access, were too heavyweight an abstraction to implement efficiently. In practice, Hydra did not fulfill its goal of extensibility.</p>
</li>
</ul>
<p>MKernel based (e.g., Mach from CMU in 90’s)</p>
<ul>
<li><p>Mach focused on extensibility and portability</p>
</li>
<li><p>Provides limited mechanisms in the microkernel, and implements all the services you’d expect from an OS as server processes that run as normal user-level processes above the kernel.</p>
</li>
<li><p>Mach achieved its goal of extensibility and portability. Performance took a back seat.</p>
</li>
<li><p>This gave microkernel-based design bad press: an OS is supposed to be very performant, and performance was not a priority in Mach.</p>
</li>
</ul>
<p>SPIN Approach to Extensibility</p>
<ul>
<li><p>Keep a minimal kernel with its extensions in the same hardware address space. This avoids border crossings between the kernel and its extensions, greatly improving performance.</p>
</li>
<li><p>SPIN utilizes a strongly typed language as the basis of building the OS. In this way, SPIN also creates the necessary protections an OS needs. This means SPIN has a <strong>Compiler enforced modularity.</strong></p>
</li>
<li><p>Data abstractions provided by the programming language such as an object, serve as containers for <strong>Logical Protection Domains</strong>. We are no longer reliant on hardware address spaces to provide protection between different services and the kernel.</p>
</li>
<li><p>The kernel only provides the interfaces, and the <strong>LPD</strong> are the ones that actually implement the functionality that is defined in the interface functions.</p>
</li>
<li><p>There can be several different implementations of interface functions. Kind of like how C has header files. This offers flexibility, because applications can dynamically bind different implementations of the same interface functions.</p>
</li>
<li><p>SPIN makes extensions as cheap as a procedure call</p>
</li>
</ul>
<h3 id="heading-logical-protection-domains">Logical Protection Domains</h3>
<p><strong>Modula-3</strong> is a strongly typed language with built in safety and encapsulation mechanisms.</p>
<ul>
<li><p>It has automatic memory management, which guards against memory leaks.</p>
</li>
<li><p>Supports data abstraction called Objects. Allows exposing externally visible methods inside an object using generic interfaces.</p>
</li>
<li><p>Supports threads that can run in the context of the object, and allows raising exceptions (i.e. memory access violation)</p>
</li>
<li><p>Modula-3 allows the creation of logical protection domains.</p>
</li>
<li><p>It allows us to get the safety of a monolithic kernel, without having to put system code in a separate hardware address space.</p>
</li>
<li><p>Objects that implement specific services can be the desired granularity of the system designer. Fine-grained protection can be done via capabilities. For example:</p>
<ul>
<li><p>Individual hardware resources (e.g., page frame)</p>
</li>
<li><p>Interfaces that provide a certain functionality (e.g., page allocation module)</p>
</li>
<li><p>Collection of interfaces (e.g., entire virtual machine)</p>
</li>
</ul>
</li>
<li><p>Capabilities to objects can be supported as pointers. Capabilities in this sense are much lighter weight than previously discussed in the Hydra OS.</p>
</li>
<li><p>Pointers in Modula-3 are type-specific, while in C they are not.</p>
</li>
</ul>
<p><strong>Spin Mechanisms for Protection Domains</strong></p>
<p>There are 3 Mechanisms in SPIN to create protection domains and use them.</p>
<p><strong>Create</strong></p>
<p>Allows creating a logical protection domain: instantiate an object file with its contents, and export the names of the entry-point methods contained inside the object so that they are visible outside.</p>
<p><strong>Resolve</strong></p>
<p>Resolve is similar to linking two separately compiled files together: it resolves the names between a source and a target, two logical protection domains, which are then dynamically linked and bound together. Once resolved, accessing methods inside the target domain and sharing resources happens at memory speed, as efficient as a procedure call.</p>
<p><strong>Combine</strong></p>
<p>Combine creates an aggregate, larger protection domain. Once the names in the source and target domains are resolved, an aggregate domain can be created whose entry points are the union of those of the source and target, or of any number of domains.</p>
<h3 id="heading-customized-os-with-spin">Customized OS with Spin</h3>
<p>The logical protection domain gives us the ability to extend SPIN to include OS services and make it all part of the same hardware address space. No border crossing between services and mechanisms provided by SPIN.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1725986938587/76538e61-9083-46fa-80dd-a01752f5e9d3.png" alt class="image--center mx-auto" /></p>
<p>Two different extensions of SPIN can live on the same hardware framework. In this picture, while there are some common subsystems, process 2 uses a differently implemented memory manager than process 1.</p>
<h3 id="heading-spin-mechanisms-for-events">Spin Mechanisms for Events</h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1725987502286/3466f709-937f-4c5c-ba93-305b0bda25e8.png" alt class="image--center mx-auto" /></p>
<p>An OS needs to field events in order to process such events. SPIN has an event-based communication model. Services can register event handlers with the SPIN event dispatcher. It supports many mappings between event and event handler, such as 1-to-1 mapping, many-to-1 mapping, and 1-to-many mappings.</p>
<h3 id="heading-default-core-services-in-spin">Default Core Services in SPIN</h3>
<p>SPIN provides interface procedures for implementing services in the OS. Think of it like SPIN providing header files that can have different Logical Protection Domain implementations. Once the LPD is properly instantiated, it becomes an extension of SPIN. There is no border crossing between SPIN and the extensions.</p>
<p>Memory Management:</p>
<ul>
<li><p>Physical Address: Allocate, Deallocate, Reclaim</p>
</li>
<li><p>Virtual Address: Allocate, Deallocate</p>
</li>
<li><p>Translation: Create/Destroy Address spaces, Add/Remove mapping between virtual pages and physical frames</p>
</li>
<li><p>Event Handlers: Page fault, access fault, bad address</p>
</li>
</ul>
<p>CPU Scheduling:</p>
<ul>
<li><p>The <strong>SPIN global scheduler</strong> interacts with the threads package of the application (the extension living atop SPIN, which could be an entire OS or just an app). The global scheduler decides, at a macro level, the amount of time given to a particular extension.</p>
</li>
<li><p>SPIN provides the abstraction of a <strong>Strand,</strong> the unit of scheduling that SPIN’s global scheduler uses. The semantics of a strand are entirely defined by the extension.</p>
</li>
<li><p>Event handlers: Block, Unblock, Checkpoint, Resume</p>
</li>
</ul>
<h3 id="heading-conclusion">Conclusion</h3>
<p>SPIN introduces a flexible and customizable system where applications can extend kernel functionality by dynamically adding code modules. These extensions allow the system to adapt its behavior to the needs of specific applications, improving performance and flexibility.</p>
<p>This method, however, would be considered unsafe in some instances without the language-based protection mechanisms that SPIN imposes on the OS. It uses Modula-3, a type-safe programming language, to prevent unauthorized or harmful code from compromising the stability or security of the system. This ensures applications can add custom functionality without performing dangerous operations that could corrupt kernel data.</p>
<p>By executing kernel extensions written in Modula-3 within the same address space, SPIN avoids the overhead of traditional context switches and data copying. SPIN uses a fine-grained, event-based model where extensions are triggered by specific system events; this efficient event handling, together with safe access to kernel data structures, allows the system to achieve high performance.</p>
<p>This paper presents experimental results showing how SPIN can achieve performance comparable to or better than a traditional monolithic OS. The authors demonstrate this through various benchmarks, including file system and network performance evaluations.</p>
<p>The SPIN OS offers a novel approach to OS design by providing a framework that allows for safe and efficient extensibility by utilizing language-based safety mechanisms, as well as using the same hardware space. This makes SPIN compelling, particularly in scenarios where customization and efficiency are crucial.</p>
<h2 id="heading-exokernel-approach">Exokernel Approach</h2>
<p>The name Exokernel comes from the fact that the kernel exposes hardware explicitly to the operating system extensions living above it. In the Exokernel approach, we decouple the authorization of hardware from its actual use.</p>
<p>The request is processed in the Exokernel, and the hardware is then exposed through secure binding. The Exokernel then returns the encrypted key to the requesting library operating system. The semantics of how the resources are going to be used are irrelevant to the Exokernel and are up to the library OS. After all of this, the OS can utilize the resource by presenting the key it received to the Exokernel, in which the Exokernel will validate and then grant access.</p>
<p>In this way, the Exokernel serves as a sort of doorman between the OS and the Hardware.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1725999471671/39bea3c9-02dc-4d72-8ebb-af550a5911c0.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-examples-of-candidate-resources">Examples of Candidate Resources</h3>
<p>TLB Entry</p>
<ul>
<li><p>Virtual to Physical mapping done by library.</p>
</li>
<li><p>Binding is presented to Exokernel with encrypted key.</p>
</li>
<li><p>Exokernel puts the mapping into the hardware TLB.</p>
</li>
<li><p>Subsequent accesses through this new virtual-page mapping can then happen multiple times without Exokernel intervention.</p>
</li>
</ul>
<p>Packet Filter</p>
<ul>
<li><p>Needs to be executed by the Exokernel every time there is a packet arrival, on behalf of the OS.</p>
</li>
<li><p>Predicates for incoming packets are loaded into the kernel by the library OS, which is a heavyweight operation, but once the binding has been established, it does not incur intervention by the Exokernel.</p>
</li>
</ul>
<h3 id="heading-implementing-secure-bindings">Implementing Secure Bindings</h3>
<p>There are three methods for implementing secure bindings.</p>
<ul>
<li><p>Hardware mechanisms, such as TLB entry, retrieving a physical page frame from the Exokernel, or portions of the frame buffer that is being used by the display, can be bound by the Exokernel to the OS. Once the OS has an encrypted key for the resource, it can use it any time it wants.</p>
</li>
<li><p>Software caching, such as <strong>“Shadow” TLB</strong>, which means caching the hardware TLB in a software cache for each OS to avoid the context switch penalties from switching from one OS to another.</p>
</li>
<li><p>Downloading code into the kernel, which is functionally equivalent to SPIN extensions, such as with the Packet Filter. However, the Exokernel does not have the same protection benefits as SPIN, because the code downloaded into the Exokernel is arbitrary rather than written in a safe language.</p>
</li>
</ul>
<h3 id="heading-default-core-services-in-exokernel">Default Core Services in Exokernel</h3>
<p>Memory Management (in the example of a Page Fault)</p>
<ol>
<li><p>When a thread incurs a page fault, it is first fielded by the Exokernel.</p>
</li>
<li><p>The Exokernel knows which threads belong to which OS, so then it informs the appropriate Library OS through a registered handler.</p>
</li>
<li><p>The OS then services the page fault which may involve requesting a page frame from the Exokernel to host the specific page that is missing.</p>
</li>
<li><p>If this happens, the library OS will ask the Exokernel for a page frame; a binding will be created and a new encrypted key returned.</p>
</li>
<li><p>The Library OS then presents the mapping to the Exokernel to be placed in the hardware TLB.</p>
</li>
</ol>
<p>If the mapping is approved, the Exokernel installs it in the hardware TLB.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1726004293468/41797016-3214-4fda-af00-d59e352c705d.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-memory-management-using-s-tlb">Memory Management using S-TLB</h3>
<p>On a context switch, one of the biggest sources of performance loss is that locality is lost for the newly scheduled process. When we switch from one Library OS to another, the TLB is completely wiped, which is a massive overhead.</p>
<p>This is where the <strong>Software TLB (S-TLB)</strong> comes in. The S-TLB is a data structure in the Exokernel that represents the respective mappings for an OS.</p>
<p>On a TLB wipe, the hardware TLB is dumped into the S-TLB of the OS that was previously running. On an OS context switch, the hardware TLB is pre-loaded from the S-TLB of the incoming OS.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1726005392682/bce77dc8-37b2-4be4-a56c-986a4d08f959.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-default-core-services-in-exokernel-cont">Default Core Services in Exokernel (cont.)</h3>
<p>CPU Scheduling maintains a linear vector of time slots. Time is divided into epochs T<sub>1</sub> to T<sub>n</sub>, each called a <strong>Time Quantum</strong>. Each time quantum has a start and end time, and represents the time allocated to the Library OSes that live atop the Exokernel. Each Library OS gets to mark its time quanta at startup in the linear vector of time slots.</p>
<p>If an OS misbehaves, such as taking more time than its quantum allows, the Exokernel will remember, and will take that time off its quantum the next time it is scheduled. During the time quantum, the OS has complete control of the processor, and the Exokernel will not get involved unless a process running on behalf of the OS incurs a page fault.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1726013109622/50fddd4f-6614-4d6f-a86c-fd8f36d3f814.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-revocation-of-resources">Revocation of Resources</h3>
<p>The Exokernel keeps track of which resources are allocated to which Library OSes. At any given time, the Exokernel can revoke resources it has granted to any Library OS, by sending a <strong>revoke</strong> call with a repossession vector detailing the resources being revoked from that Library OS.</p>
<p>The Library OS is then responsible for taking corrective action to clean up and finalize tasks with the resources that are being revoked.</p>
<p>The Library OS can also seed the Exokernel with autosave options for resources the Exokernel may want to revoke. For example, if the Exokernel decides to revoke page frames from a Library OS, the Library OS could have seeded the Exokernel ahead of time so that it dumps the page frames onto disk on the OS’s behalf.</p>
<h3 id="heading-code-usage-by-exokernel-examples">Code Usage by Exokernel (Examples)</h3>
<ul>
<li><p>Packet filter for de-multiplexing network packets</p>
</li>
<li><p>Run code in Exokernel on behalf of Library OS not currently scheduled (e.g. garbage collection for a specific App)</p>
</li>
</ul>
<h3 id="heading-putting-it-all-together">Putting it all Together</h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1726013565217/93b2a914-2074-47a1-bb0f-82d64d5a9454.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-exokernel-data-structures">Exokernel Data Structures</h3>
<p>The Exokernel maintains a PE data structure on behalf of each Library OS. The PE data structure contains the entry points in the Library OS for dealing with the different kinds of program discontinuities: exceptions, external interrupts, system calls, and addressing context. Each PE structure is unique to its Library OS. This setup is functionally similar to the SPIN OS event handlers.</p>
<p>In addition to the PE data structure, the Exokernel also keeps an S-TLB for each OS, holding its &ldquo;guaranteed mappings.&rdquo; The entry point associated with addressing context is particularly important here: the Library OS requests a set of guaranteed mappings, and only the TLB entries that the Exokernel has guaranteed to keep on behalf of that particular OS are dumped into the S-TLB.</p>
<p>Downloading code into the kernel allows first-level interrupt handling on behalf of a Library OS. This is particularly useful when an interrupt arrives while a different OS is scheduled: the downloaded code in the kernel can handle the interrupt directly, rather than the wrong OS receiving it.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1726025770557/0292e826-fa86-4171-93ba-52015c65e57b.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-performance-results-of-spin-and-exokernel">Performance Results of SPIN and Exokernel</h3>
<p>When reading research papers, absolute numbers are meaningless on their own. When analyzing performance results, you need to compare them against the competition of their time. Performance questions are always centered around space (memory) and time (speed).</p>
<blockquote>
<p>How much better is <strong>X</strong> approach over <strong>Y</strong> approach?</p>
<p>What is the code size difference between X OS vs. Y OS?</p>
</blockquote>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1726026053199/8942dc80-51df-4ed2-89de-9707c64e0204.png" alt class="image--center mx-auto" /></p>
<h2 id="heading-l3-microkernel-approach">L3 Microkernel Approach</h2>
<p>The L3 Microkernel, by proof of construction, debunks the myths about the assumed low performance of the microkernel-based approach. As a recap, the microkernel approach was deemed low performance because of its constant border crossings, which ensured protection and extensibility at a supposed cost in performance.</p>
<p>It is possible to construct an L3 Microkernel where microservices are still in their own protection domain, but in the same hardware address space. Therefore, it’s all about how efficient the implementation is.</p>
<h3 id="heading-strikes-against-microkernel">Strikes Against Microkernel</h3>
<p>Explicit costs:</p>
<ul>
<li><p>Kernel&ndash;user switches: the cost of the border crossing is too high</p>
</li>
<li><p>Address space switches: cross-protection-domain calls involve flushing the TLB every time</p>
</li>
<li><p>Thread switches and inter-process communication have to be mediated by the microkernel, which involves border crossings at every point</p>
</li>
</ul>
<p>Implicit cost:</p>
<ul>
<li>Memory effects: constant loss of locality when going across address spaces.</li>
</ul>
<h3 id="heading-debunking-mkernel-myths-with-l3">Debunking MKernel Myths with L3</h3>
<p>L3 by proof of construction, systematically debunks all the myths in the “Strikes Against Microkernel” section.</p>
<h3 id="heading-user-kernel-border-crossing"><strong>User Kernel Border Crossing</strong></h3>
<p>The user-to-kernel border crossing cost is not as high as thought. The L3 Microkernel can do this border crossing, including TLB and cache misses, in 123 processor cycles. By comparison, CMU&rsquo;s Mach OS, which SPIN and Exokernel both compare themselves against, does it in 900 processor cycles.</p>
<h3 id="heading-address-space-switches"><strong>Address Space Switches</strong></h3>
<p>In address space switches, the TLB is used to translate the virtual page number into a physical frame number.</p>
<p>On a context switch, the TLB is normally flushed, which is costly when switching address spaces often. However, the flush can sometimes be avoided: if the TLB stores address-space tags, the entire TLB does not need to be flushed, depending on the situation.</p>
<p>In an architecture that supports address space tags in the TLB:</p>
<ol>
<li><p>When an entry is made in the TLB, the Process ID (PID) of the installing process is stored with the entry as an address-space tag.</p>
</li>
<li><p>When an address translation happens, the virtual page number (VPN) is split into tag and index as usual, but the lookup also carries the current PID to compare against the entry&rsquo;s address-space tag.</p>
</li>
<li><p>On a hit, the VPN tag is compared against the tag in the TLB, and the entry&rsquo;s address-space tag is compared against the current PID.</p>
</li>
<li><p>If both match, the entry belongs to the running process, so there is no need to flush the TLB on the context switch.</p>
</li>
</ol>
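<p>The steps above can be condensed into a small sketch. This is an illustrative model (hypothetical names), treating the tagged TLB as a map keyed by both the address-space tag and the VPN, so entries from different processes coexist without flushing:</p>

```python
# Sketch of a TLB with address-space tags: each entry carries the PID that
# installed it, so a lookup only hits when both the VPN and the
# address-space tag match, and no flush is needed on a context switch.

class TaggedTLB:
    def __init__(self):
        self.entries = {}   # (pid, vpn) -> physical frame number

    def install(self, pid, vpn, pfn):
        self.entries[(pid, vpn)] = pfn

    def lookup(self, pid, vpn):
        """Hit only if the entry's address-space tag matches the current PID."""
        return self.entries.get((pid, vpn))   # None models a TLB miss

tlb = TaggedTLB()
tlb.install(pid=1, vpn=0x40, pfn=0x900)
tlb.install(pid=2, vpn=0x40, pfn=0x123)   # same VPN, different address space
```

Note how the same VPN resolves to different frames for different PIDs; that coexistence is exactly what removes the need to flush on a context switch.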
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1726079177117/0a694809-1058-4033-802c-28adbe3bf4d0.png" alt class="image--center mx-auto" /></p>
<p>Liedtke, the author of the L3 Microkernel approach, suggests taking advantage of whatever the architecture offers you in regards to avoiding TLB flush. For example:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1726079490801/52e06167-d64c-427e-bf16-9cd5255aab18.png" alt class="image--center mx-auto" /></p>
<p>In x86 and PowerPC, segment registers are offered. Segment registers give an OS the opportunity to specify a range of virtual addresses that can be legally accessed by the currently running process. We can use segment registers to create a protection domain to bound addresses for a running process. The segment bounds are hardware enforced.</p>
<p><strong>What if a protection domain is so large that it needs all of the Hardware address space?</strong></p>
<p>If a protection domain is so large that it occupies the entire hardware address space, and the TLB does not support address-space tagging, then you have to do a TLB flush on a context switch.</p>
<p>In this case, the explicit costs we have been discussing become insignificant compared to the much larger implicit cost that is incurred: the cache effects of context switching between processes that each take the entire hardware address space are substantial.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1726080515915/2c7e9589-b0ad-4cf2-adcc-08aae9bdebba.png" alt class="image--center mx-auto" /></p>
<p>If we are switching in <strong>Small Protection Domains:</strong></p>
<ul>
<li>By taking advantage of architecture/hardware options given to you, careful construction of processes can render address space switching efficient.</li>
</ul>
<p>If we are switching in <strong>Large Protection Domains:</strong></p>
<ul>
<li><p>Switching cost is insignificant relative to the implicit costs</p>
</li>
<li><p>The cache and TLB effects dominate; they are a much more significant implicit cost</p>
</li>
</ul>
<h3 id="heading-thread-switches-and-ipc">Thread Switches and IPC</h3>
<p>The explicit cost of a thread switch, in which the microkernel begins running another thread, involves saving all the volatile state of the processor. By proof of construction, it is shown to be competitive with the SPIN and Exokernel approaches.</p>
<h3 id="heading-memory-effects">Memory effects</h3>
<p>Memory effects can be mitigated by carefully structuring the protection domains in the hardware address space. When the protection domains are small, Liedtke suggests packing them all into the same hardware address space and enforcing protection between processes using segment registers. This way, the caches stay warm even across context switches.</p>
<p>In debunking the myths regarding address space switches, we also have strategies to mitigate memory effect issues as well.</p>
<h3 id="heading-why-was-border-crossing-in-mach-so-expensive">Why was border crossing in Mach so expensive?</h3>
<p>Mach&rsquo;s significant focus on portability made it architecture independent, allowing the Mach microkernel to run on several different processor architectures. This meant <strong>code bloat</strong> in Mach, because it has to maintain compatibility with every architecture it aims to run on. Specifically, Mach has architecture-independent and architecture-specific parts of the microkernel.</p>
<p>This means Mach has a <strong>large memory footprint</strong>, which means it has <strong>lesser locality</strong>, which means it has <strong>more cache misses,</strong> therefore there is <strong>longer latency for border crossings.</strong></p>
<p>Portability and performance contradict each other.</p>
<h3 id="heading-thesis-of-l3-of-os-structuring">Thesis of L3 of OS Structuring</h3>
<p>The principles of L3 are as follows:</p>
<ul>
<li><p>The microkernel should have <strong>minimal</strong> abstractions: threads, address spaces, interprocess communication, and unique-ID generation. These four are needed because any OS subsystem built on top of a microkernel relies on them.</p>
</li>
<li><p>Microkernels are processor-specific in implementation. Avoid portability if you want an efficient implementation of a microkernel. You need to exploit anything a certain architecture offers you.</p>
</li>
</ul>
<p>With the right set of microkernel abstractions and a <em>processor-specific</em> implementation, you can build efficient <em>processor-independent</em> abstractions at higher layers.</p>
<h2 id="heading-papers">Papers</h2>
<p><strong>Brian Bershad et al., "Extensibility, Safety and Performance in the SPIN Operating System ", Proceedings of the 15th ACM Symposium on Operating System Principles, December 1995.</strong></p>
<p><strong>Dawson R. Engler, Frans Kaashoek and James O'Toole, "Exokernel: An Operating System Architecture for Application-Level Resource Management ", Proceedings of the 15th ACM Symposium on Operating System Principles, ACM, December 1995.</strong></p>
<p><strong>J. Liedtke, " On Micro-Kernel Construction ", Proceedings of the 15th ACM Symposium on Operating System Principles, ACM, December 1995.</strong></p>
<p><strong>J. Liedtke, "Improved Address-Space Switching on Pentium Processors by Transparently Multiplexing User Address Spaces ", GMD Technical Report No. 933, November 1995 (self-study).</strong></p>
<h1 id="heading-virtualization">Virtualization</h1>
<h2 id="heading-introduction-to-virtualization">Introduction to Virtualization</h2>
<h3 id="heading-platform-virtualization">Platform Virtualization</h3>
<p>Platform virtualization presents the system as a black box running atop a virtual platform. The end user is satisfied as long as the platform provides the same functionality the system appears to have.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1726110896006/4d00ce62-960e-44e6-b5d7-9c485d28dc18.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-utility-computing">Utility Computing</h3>
<p>Multiple apps and OSes can run atop shared hardware resources. Running different apps together lowers overall server costs, because the computing resources are not all in use at the same time.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1726111018678/0af40fa0-a2cf-4acb-b3d3-c5585b6e2a77.png" alt class="image--center mx-auto" /></p>
<p>Utility computing is promoted by data centers worldwide, and it is how different end-users are able to access more computing resources for their specific peak times, versus having to purchase their own computers.</p>
<h3 id="heading-hypervisors">Hypervisors</h3>
<p>How is it possible that multiple different OSes can run atop the same shared hardware resources? Who decides who gets what? A hypervisor, which is like an OS of OSes.</p>
<p>There are native (bare-metal) hypervisors, which run directly atop the shared hardware, with the OSes running atop the hypervisor.</p>
<p>The second type is the hosted hypervisor, which runs atop a host OS and allows the user to emulate other OSes.</p>
<h3 id="heading-connecting-the-dots">Connecting the Dots</h3>
<ul>
<li><p>The concept of virtualization started in the 1970s with IBM&rsquo;s VM/370. The intent was to give every user of a computer the illusion of owning that computer.</p>
</li>
<li><p>Microkernels were introduced in the late 80’s and 90’s</p>
</li>
<li><p>Extensibility of OS was investigated in the 90’s</p>
</li>
<li><p>SimOS was introduced in the late 90s; it laid the groundwork for the modern resurgence of virtualization technology and was the basis for VMware.</p>
</li>
<li><p>Xen and VMware were based on papers laid out in the early 2000’s.</p>
</li>
</ul>
<p>Today, data centers provide accountability for usage: companies offer cloud virtualization services and bill users individually. Virtualization has made computing another utility, like water or electricity.</p>
<h3 id="heading-full-virtualization">Full Virtualization</h3>
<p>The full virtualization framework leaves the OS essentially untouched, allowing you to run the unchanged binary of the OS atop the hypervisor. The OSes run as user-level processes atop the hypervisor, which means they do not get the privileges they would have running directly on bare metal.</p>
<p>When the OS executes a privileged instruction, the instruction traps into the hypervisor, which emulates the intended functionality on the OS&rsquo;s behalf.</p>
<p>In some architectures, privileged actions may <em>fail silently</em> instead of trapping. The hypervisor must watch for the issues that can arise in those OSes and, when such failures do happen, capture and fix them. Early on this was a significant problem, but architectures are now much better at supporting virtualization natively.</p>
<h3 id="heading-para-virtualization">Para Virtualization</h3>
<p>Another approach to virtualization is to modify the source code of the guest OS. Doing this, we can avoid problematic instructions, and include optimizations between the OS and the underlying hardware.</p>
<p>Less than 2% of the Guest OS code needs modification with para virtualization. This was proven in Xen by proof of construction.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1726112259329/fd937d68-bf68-41a5-8833-8414e29b5151.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-big-picture">Big Picture</h3>
<p>Virtualize Hardware:</p>
<ul>
<li><p>Memory Hierarchy</p>
</li>
<li><p>CPU</p>
</li>
<li><p>Devices</p>
</li>
</ul>
<p>Effect data transfer and control transfer between the guests and the hypervisor</p>
<h2 id="heading-memory-virtualization">Memory Virtualization</h2>
<h3 id="heading-memory-hierarchy">Memory Hierarchy</h3>
<p>Caches are physically tagged, so not much work is needed to virtualize them. The main struggle comes in handling virtual memory: mapping virtual addresses to physical ones, a key functionality of the memory management subsystem.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1726160281152/b839e138-e9a8-4ca5-ad8b-b6f76457ed4c.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-memory-subsystem-recall">Memory Subsystem Recall</h3>
<p>Each process is in its own protection domain, and usually, a separate hardware address space. The OS maintains a page table on behalf of each of the processes. The page table is an OS data structure that holds the mapping between the virtual page numbers and the physical pages where those virtual pages are contained in the main memory of the hardware.</p>
<p>The physical memory is contiguous, but the virtual addresses do not point to contiguous sections of physical memory; instead, the locations are scattered across physical memory.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1726160516540/89a35e64-0d37-46d7-9e51-33782f4b6651.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-memory-management-and-hypervisor">Memory Management and Hypervisor</h3>
<p>In the virtualized setup, the hypervisor sits between the guest OS and the hardware. Each process in each OS is in its own protection domain, which means each process has a distinct page table in its OS. The hypervisor doesn&rsquo;t know about these page tables stored in the OSes.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1726160768259/729bab39-f895-46ea-83b0-d21319750390.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-memory-manager-zoomed-out">Memory Manager Zoomed Out</h3>
<p>The actual physical memory, called the machine memory, is under the control of the hypervisor. The &ldquo;physical memory&rdquo; an OS sees is an illusion presented by the hypervisor: it is not actually contiguous physical memory; instead, the hypervisor places those regions of physical memory anywhere in the machine memory. The needs of an OS may change through its lifespan, and the hypervisor may need to allot more machine memory to an OS in a non-contiguous manner.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1726161031050/84059220-0b95-4d0e-ab68-fe8688842805.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-zooming-back-in">Zooming Back In</h3>
<p>In a non-virtualized setting, the virtual page number goes through the page table and results with a physical page number.</p>
<blockquote>
<p>Non-Virtualized Setting:</p>
<p>VPN → PT → PPN</p>
</blockquote>
<p>In a virtualized setting, however, there is another level of indirection: the physical page number of the OS has to be mapped to a machine page number (MPN). The mapping between the physical page number (PPN) and the machine page number (MPN) is kept in a shadow page table (S-PT).</p>
<blockquote>
<p>Virtualized Setting:</p>
<p>VPN → PT → PPN → S-PT → MPN</p>
</blockquote>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1726161390337/ba669617-7cbe-45a5-a11c-80afa2954026.png" alt class="image--center mx-auto" /></p>
<p>In the case of full virtualization, the guest OS does not know it is not running on bare metal, and therefore the hypervisor stores the PPN → MPN mappings.</p>
<p>In the case of para virtualization, the guest OS is aware it is being virtualized, and it can and usually does store the PPN → MPN mappings instead of the hypervisor.</p>
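<p>The extra level of indirection can be shown with a toy two-hop lookup. The table contents here are made-up values purely for illustration:</p>

```python
# Minimal sketch of the extra indirection: the guest page table maps
# VPN -> PPN, while the shadow page table (kept by the hypervisor under
# full virtualization) maps PPN -> MPN.

guest_page_table = {0x10: 0x2}     # VPN -> PPN (maintained by the guest OS)
shadow_page_table = {0x2: 0x7F}    # PPN -> MPN (maintained by the hypervisor)

def translate(vpn):
    ppn = guest_page_table[vpn]     # first hop: the guest's illusory view
    mpn = shadow_page_table[ppn]    # second hop: the real machine page
    return mpn
```

So a guest reference to VPN 0x10 lands on machine page 0x7F, even though the guest believes its data lives on physical page 0x2.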
<h3 id="heading-shadow-page-table">Shadow Page Table</h3>
<p>In many architectures the CPU uses the page table for address translation. In the case of virtualization, the regularly used hardware page table is now used as the shadow page table.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1726161789309/f0d60fc1-29a5-411a-891d-85d86ad87e04.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-efficient-mapping-full-virtualization">Efficient Mapping (Full Virtualization)</h3>
<p>When a fully virtualized guest OS attempts to map a VPN to a PPN through its page table, it requires a privileged action. This traps into the hypervisor, which matches the particular VPN to an entry in the S-PT. The translations are installed into the TLB / hardware page table.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1726162081323/a72ff21e-92ea-4fc2-bd6f-9735b63e0eff.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-efficient-mapping-para-virtualization">Efficient Mapping (Para Virtualization)</h3>
<p>In a para virtualized setting, the guest OS is aware that the physical memory is non-contiguous. The burden of efficient mapping can be moved from the hypervisor to the guest OS.</p>
<p>In Xen (a paravirtualized hypervisor), a set of <strong>hypercalls</strong> is provided for the guest OS to tell the hypervisor about the changes made to the hardware page table.</p>
<p>The guest OS makes hypercalls to the hypervisor: for example, it can initialize a page frame and then ask the hypervisor to create a page table on its behalf using that frame. Other supported operations include switching and updating page tables.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1726175673032/c462319d-6e0e-4de6-9cb2-b96dd578a6c0.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-dynamically-increasing-memory">Dynamically Increasing Memory</h3>
<p>The hypervisor must be ready to handle the varying memory needs of the guest OSes by allocating them regions of machine memory. If all the machine memory is taken, the hypervisor can take back a portion of physical memory from another OS. Although this can run into memory issues, it is also possible for the hypervisor to request memory from an OS, so that the OS voluntarily gives some up to its peers.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1726175893843/fe822a23-4cdc-4f24-ae2c-4e3278d866d9.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-ballooning">Ballooning</h3>
<p><strong>Ballooning</strong> is a technique in which a special device driver is installed in every guest OS, under both full and para virtualization. This <strong>balloon</strong> device driver is the key to managing the memory pressure a virtual machine may experience in a virtualized setting.</p>
<p>If the hypervisor needs more memory, it contacts a guest OS&rsquo;s balloon driver through a private channel between the balloon and the hypervisor. The hypervisor then instructs the balloon to inflate, making the balloon request more memory from the guest OS. Once the balloon has acquired the memory, it returns the gathered physical memory to the hypervisor.</p>
<p>If the hypervisor needs less memory, it tells the guest OS&rsquo;s balloon driver to deflate, which releases memory back into the guest OS.</p>
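<p>The inflate/deflate protocol can be sketched as a tiny state machine. Names and the page-count interface are illustrative, not the actual driver API:</p>

```python
# Hypothetical sketch of the balloon driver protocol: inflating the balloon
# pressures the guest to give up pages, which the driver hands to the
# hypervisor; deflating releases pages back into the guest.

class BalloonDriver:
    def __init__(self, guest_free_pages):
        self.guest_free_pages = guest_free_pages
        self.pinned = 0   # pages the balloon currently holds

    def inflate(self, pages):
        """Hypervisor asks for memory: take pages from the guest."""
        taken = min(pages, self.guest_free_pages)
        self.guest_free_pages -= taken
        self.pinned += taken
        return taken   # these pages are handed back to the hypervisor

    def deflate(self, pages):
        """Hypervisor returns memory: release pages into the guest."""
        released = min(pages, self.pinned)
        self.pinned -= released
        self.guest_free_pages += released
        return released
```

The key point the sketch captures is that the hypervisor never takes pages directly; the guest's own memory manager decides which pages to surrender when the balloon inflates.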
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1726176304814/01294c25-ced0-4376-97c6-8ea13ee3a645.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-sharing-memory-across-virtual-machines">Sharing Memory Across Virtual Machines</h3>
<p>Depending on the situation, it is possible to share space in the machine memory. For example, if two VMs are running the same version of Linux with the same version of Firefox, then to avoid storing duplicate data, parts of each VM&rsquo;s memory can be pointed at the same place in machine memory.</p>
<p>One way to do this is with cooperation between the VMs and the hypervisor, where hooks allow them to inform each other of the situation.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1726176601436/adc2fb49-9019-42e9-909d-a55b675f8363.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-vm-oblivious-page-sharing">VM Oblivious Page Sharing</h3>
<p>An alternative is for the VMs to be completely oblivious to the page sharing that is happening. The hypervisor keeps a hash table containing a content hash of each machine page.</p>
<ol>
<li><p>Take the contents of a page, create a content hash representing that page which corresponds to the PPN of the specific memory in the VM.</p>
</li>
<li><p>Then, take the hash, and look through the hypervisor hash table to see if there is a match between the hash and any page currently in the machine memory.</p>
</li>
<li><p>If there is a match, we have a <em>hint</em> that the stored data is a duplicate. It is only a hint because the other VM that owns the matched data could have modified it since the hash was taken.</p>
</li>
<li><p>Then, a full comparison must be made, in order to verify it is in fact duplicate data.</p>
</li>
<li><p>If the full comparison succeeds, we map the VM&rsquo;s memory to the same spot in machine memory. The reference table then increments a count of how many mappings point to that place in memory.</p>
</li>
<li><p>The entries are then marked as <strong>copy on write</strong> entries, indicating that they can share the page as long as they are reading it, but if any of them try to write to it, a copy of the original page will be made and things will be remapped.</p>
</li>
<li><p>After this successful match process is done, the now-duplicate page frame can be freed up in the machine memory.</p>
</li>
</ol>
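<p>The hash-as-hint flow above can be sketched compactly. This is an illustrative model, not VMware's implementation; structure names are invented:</p>

```python
# Sketch of VM-oblivious page sharing: hash the page contents, use a match
# in the hypervisor's hash table only as a hint, confirm with a full
# comparison, then map both PPNs to one machine page (shared copy-on-write).

import hashlib

class PageSharer:
    def __init__(self):
        self.hash_table = {}   # content hash -> (mpn, page contents)
        self.refcount = {}     # mpn -> number of mappings to that page
        self.next_mpn = 0

    def register(self, contents):
        """Return (mpn, shared?) for a page with these contents."""
        h = hashlib.sha256(contents).hexdigest()
        hit = self.hash_table.get(h)
        if hit is not None:
            mpn, stored = hit
            if stored == contents:        # full comparison; hash was a hint
                self.refcount[mpn] += 1   # share the page copy-on-write
                return mpn, True
        mpn = self.next_mpn               # no duplicate: back it normally
        self.next_mpn += 1
        self.hash_table[h] = (mpn, contents)
        self.refcount[mpn] = 1
        return mpn, False
```

A real implementation would also break the sharing (copy the page and decrement the refcount) on any write to a shared page; that copy-on-write path is omitted here for brevity.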
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1726177888039/870a5cca-cb73-491c-bd1e-03998bc5b9dc.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-memory-allocation-policies">Memory Allocation Policies</h3>
<p>Different policies are in place for allocating and reclaiming memory. The goal of virtualization is maximization of the utilization of resources. Virtualized environments use different policies for memory allocation, such as:</p>
<ul>
<li><p><strong>Pure share-based approach:</strong> You pay less, you get less. The data center gives you an amount of resources directly correlated with the money you pay. This can lead to hoarding, where a VM gets a bunch of resources and holds onto them without using them.</p>
</li>
<li><p><strong>Working-set based approach:</strong> If a VM&rsquo;s working set needs more memory, you give it more; if it needs less, you give it less. However, some might feel that if you paid for a certain amount of resources, you should always be able to get those resources.</p>
</li>
<li><p><strong>Dynamic idle-adjusted shares approach:</strong> Mixes both approaches by taxing idle pages more than active pages. If the resources you&rsquo;ve been given are actively used, more power to you; but if you are just holding them, they are taken away, or &ldquo;taxed.&rdquo; Different data centers can set different tax rates: higher rates promote a more wealth-redistribution approach, while lower rates correspond to pay-for-what-you-get.</p>
<p>  With a tax rate somewhere in the middle, most idle memory can be reclaimed, but because idle memory is not taxed at 100%, sudden working-set increases can still be absorbed.</p>
</li>
</ul>
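<p>The idle-adjusted idea can be made concrete with a small formula. This sketch follows the shape of the ESX &ldquo;idle memory tax,&rdquo; where an idle page costs a factor of 1/(1 &minus; tax rate) more than an active one; the function name and interface are illustrative:</p>

```python
# Sketch of dynamic idle-adjusted shares: a VM's effective claim on memory
# is its paid shares divided by a per-page cost in which idle pages are
# charged extra. tax_rate = 0 gives pure share-based allocation; tax_rate
# near 1 approaches pure working-set allocation.

def adjusted_shares(shares, active_fraction, tax_rate):
    """Effective shares-per-page claim after taxing idle pages."""
    # An idle page costs 1/(1 - tax_rate) times an active page.
    idle_cost = 1.0 / (1.0 - tax_rate) if tax_rate < 1.0 else float("inf")
    per_page_cost = active_fraction + (1 - active_fraction) * idle_cost
    return shares / per_page_cost
```

Two VMs that paid for equal shares are then ranked differently under memory pressure: the one actively using its pages keeps a higher effective claim, so the hoarder's idle memory is reclaimed first.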
<h2 id="heading-cpu-amp-device-virtualization">CPU &amp; Device Virtualization</h2>
<p>We aim for the guest OS on a hypervisor to have the illusion of owning the CPU. We also want the hypervisor to field events arising due to the execution of a process that belongs to a parent guest OS.</p>
<h3 id="heading-cpu-virtualization">CPU Virtualization</h3>
<p>In each guest OS, the processes are already being multiplexed by the OS, even in a non-virtualized setting. The hypervisor has to give an illusion of ownership of the CPU to each guest OS, and it needs a precise way to account for the time each guest OS is allocated on the CPU.</p>
<p>It is possible for the hypervisor to give the guest OSes a proportional share of the CPU, depending on the service agreement each VM has with the hypervisor. There is also the fair-share scheduler, which gives an equal share to each guest OS.</p>
<p>In either case, the hypervisor must account for time that is taken by another guest OS <em>during</em> a guest OS’ time, such as is the case with external interrupts. If the time is taken away during a guest OS’ time, it will be rewarded back in the future.</p>
<p>Events arising from the process currently executing on the CPU need to be delivered to its parent guest OS, in both the full and para virtualized settings. A process may incur a system call, page fault, exception, or external interrupt. All of these discontinuities have to be passed up by the hypervisor, delivered as software interrupts to the respective guest OS.</p>
<p>Some of the ways to handle these discontinuities, however, require privileged access to be able to handle them through the guest OS. In a full virtualization environment, it is possible for the guest OS to attempt a privileged action, get trapped by the hypervisor, and then fail silently. The hypervisor needs to account for potential quirks in order to handle them.</p>
<h3 id="heading-device-virtualization">Device Virtualization</h3>
<p>For full virtualization, the OSes believe they already have full ownership of the devices. In actuality, the hypervisor uses the trap-and-emulate approach.</p>
<p>Para virtualization has much more interesting opportunities for innovation. It is possible for the hypervisor to create clean device abstractions for the guest OS. Shared buffers can also be exposed for the guest OS. There can also be innovations in the communication methods themselves, between the guest OS and the hypervisor.</p>
<h3 id="heading-control-transfer">Control Transfer</h3>
<p>For full virtualization, control transfer happens implicitly: the guest OS attempts a privileged instruction and the hypervisor traps it. In the other direction, software interrupts (events) go from the hypervisor to the guest OS.</p>
<p>For para virtualization, control transfer happens explicitly, through hypercalls from the guest into the hypervisor. Software interrupts (events) again go from the hypervisor to the guest OS, and the guest has control, via hypercalls, over when event notifications need to be delivered.</p>
<h3 id="heading-data-transfer">Data Transfer</h3>
<p>For full virtualization, data transfer is implicit.</p>
<p>For para virtualization (e.g., Xen), data transfer is explicit, offering opportunities to innovate.</p>
<p>Xen provides asynchronous I/O rings: data structures shared between the guest and Xen for communication. Any number of rings can be allocated to handle all the device I/O needs of a particular guest domain. The I/O ring itself is just a set of descriptors in the data structure. The guest OS places requests in the I/O ring by populating descriptors; every descriptor is a unique I/O request from the guest OS, with a unique ID. When Xen is done processing a request, it places the result in the same ring, in a descriptor carrying the same unique ID.</p>
<p>This is a producer&ndash;consumer relationship. For requests, the guest is the producer, writing to a shared pointer visible to Xen, and Xen is the consumer. For responses, the relationship is flipped: Xen is the producer, writing to a shared pointer visible to the guest, and the guest is the consumer.</p>
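<p>A minimal sketch of this request/response ring follows. It is an illustrative model of the relationship described above, not Xen&rsquo;s actual ring layout; the descriptors here carry a buffer pointer rather than copied data, matching the zero-copy design:</p>

```python
# Sketch of an asynchronous I/O ring in the style described above: the
# guest produces request descriptors, Xen consumes them and produces
# responses tagged with the same unique ID.

from collections import deque

class IORing:
    def __init__(self):
        self.requests = deque()    # guest = producer, Xen = consumer
        self.responses = deque()   # Xen = producer, guest = consumer
        self.next_id = 0

    def guest_enqueue(self, buffer_ptr):
        """Guest places a descriptor pointing at its own buffer (no copy)."""
        req_id = self.next_id
        self.next_id += 1
        self.requests.append({"id": req_id, "buffer": buffer_ptr})
        return req_id

    def xen_process_one(self):
        """Xen consumes one request and posts a response with the same ID."""
        req = self.requests.popleft()
        self.responses.append({"id": req["id"], "status": "done"})

    def guest_dequeue_response(self):
        return self.responses.popleft()
```

Because only descriptors move through the ring while the data stays in guest-owned buffers, the design avoids copying payloads into the hypervisor.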
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1726247225128/90417f20-b6fa-49f1-ae64-6c4f8993db59.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-control-and-data-transfer-in-action">Control and Data Transfer in Action</h3>
<p>For Xen Data Transfer in action, we’ll use Network Virtualization as an example.</p>
<p>Each guest has 2 I/O rings. One is used for transmission, the other is for reception. In the case of transmission:</p>
<p>If the guest wants to transmit packets, it will enqueue descriptors in the transmit I/O ring via hypercalls provided by Xen.</p>
<p>The packets that need to be transmitted are not copied into Xen, but the buffers that contain the packets are in the guest OS buffers.</p>
<p>Pointers to said buffers are embedded in the descriptors that the guest has enqueued for transmission.</p>
<p>For the duration of the transmission, the pages associated with the network packets are page pinned.</p>
<p>A round robin packet scheduler is used by Xen in order to transmit packets from different VMs.</p>
<p>In the case of reception:</p>
<p>When Xen receives a network packet for a particular guest OS, it exchanges the received packet for one of the guest OS pages that was already provided to Xen as a holding place for incoming packets.</p>
<p>A guest OS will pre-allocate network buffers, which are pages owned by a guest OS.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1726248126652/15760efb-7c9a-4055-b715-76413c825318.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-disk-io-virtualization">Disk I/O Virtualization</h3>
<p>Disk I/O virtualization works quite similarly to network virtualization. Every VM has an I/O ring dedicated to disk I/O, and the communication avoids any copying into Xen: the guest enqueues pointers to pre-allocated guest OS buffers rather than the data itself.</p>
<p>Xen may reorder requests from competing domains to make I/O throughput efficient. Where such reordering would be inappropriate for the semantics of the I/O operations, Xen provides a reorder barrier that lets a guest OS force operations to complete in the order they were issued.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1726248605485/0f9bbea6-8e9c-4b77-8830-b222954a2c09.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-measuring-time">Measuring Time</h3>
<p>Xen keeps accurate measures, over space and time, of each VM’s usage of CPU, memory, storage, and network, so that providers can bill their clients appropriately.</p>
<h3 id="heading-xen-and-guests">Xen and Guests</h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1726248981914/9d5ad0f6-da4d-476b-a2ef-c9ec3d439021.png" alt class="image--center mx-auto" /></p>
<h2 id="heading-papers-1">Papers</h2>
<p><strong>Paul Barham, Boris Dragovic, Keir Fraser, Steven Hand, Tim Harris, Alex Ho, Rolf Neugebauer, Ian Pratt, Andrew Warfield, "Xen and the Art of Virtualization", SOSP 2003.</strong></p>
<p><strong>Carl Waldspurger, “Memory Resource Management in VMware ESX Server”, OSDI 2002.</strong></p>
<h1 id="heading-parallel-computing">Parallel Computing</h1>
<h2 id="heading-shared-memory-machines">Shared Memory Machines</h2>
<p>All three machine models share the same ingredients: CPUs, an interconnection network, and memory.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1727710629631/e0a75c38-b1d2-4611-9fe1-628143830862.png" alt class="image--center mx-auto" /></p>
<p>In dance hall architecture, you have CPUs on the opposite side of memory that are connected via an interconnection network.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1727710717257/1f04fa9c-eb04-4855-ada2-3baf49cbd29c.png" alt class="image--center mx-auto" /></p>
<p>In SMP, each CPU takes the same amount of time to access memory.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1727710893457/2a324e08-3c20-4789-a5d0-913941a52f21.png" alt class="image--center mx-auto" /></p>
<p>In DSM, each CPU has its own memory, and can still access all other CPU memory through the interconnection network.</p>
<h3 id="heading-shared-memory-and-caches">Shared Memory and Caches</h3>
<p>When a CPU wants to access a memory location, it first checks its cache; if the location is not there, it fetches it from main memory and places it in the cache.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1727711013495/3b640216-e532-48b6-893e-056e988449e2.png" alt class="image--center mx-auto" /></p>
<p>A multiprocessor behaves the same way, but with a new problem: if a value cached in more than one cache is changed, how do we update it across all the caches? This is the cache coherence problem. Who should ensure this consistency?</p>
<p>This requires an agreement between hardware and software, called a memory consistency model.</p>
<h3 id="heading-memory-consistency-model">Memory Consistency Model</h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1727711400417/06833532-8ac1-466d-8699-20f5e8b8157e.png" alt class="image--center mx-auto" /></p>
<p>There is no guarantee about how the steps of different processes interleave: a step in one process may run before or after any given step of another.</p>
<p>The sequential consistency model maintains program order within each process plus arbitrary interleaving between processes. It’s like the merge shuffle of two card decks at a casino.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1727711623052/921f1d69-1531-4c5d-9a96-cb670bc8bf66.png" alt class="image--center mx-auto" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1727711894836/e1eb02c8-5a42-4b54-98f2-11ed03d75b06.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-hardware-cache-coherence">Hardware Cache Coherence</h3>
<p>Suppose a processor writes to a memory location that is also present in other processors’ caches. In that case, the hardware will invalidate all the other cached copies of that location. This is called <strong>write invalidate</strong>.</p>
<p>In the <strong>write update</strong> scheme, however, the hardware updates it instead of invalidating it.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1727712097859/5dc88900-f05c-4993-b928-0f0b064ce6e1.png" alt class="image--center mx-auto" /></p>
<p>The goal is to keep this coherence overhead to a minimum, since it grows, among other things, as more CPUs are added and the system scales up.</p>
<h3 id="heading-scalability">Scalability</h3>
<p>There is overhead that increases as more processors are added.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1727712185623/bd81215b-1d07-4f0e-ace5-7df87ead606c.png" alt class="image--center mx-auto" /></p>
<p>To optimize performance, minimize sharing of memory across threads as much as possible.</p>
<h2 id="heading-synchronization">Synchronization</h2>
<h3 id="heading-summary">Summary</h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1727712448951/92b69054-ab95-4653-be0f-e991a0b1b56f.png" alt class="image--center mx-auto" /></p>
<p>A lock protects data so that operations on it can only be performed by one particular process at a time. An exclusive lock ensures that only one process can access the data at a time, while a shared lock allows simultaneous access, for example when many processes read the same data.</p>
<h3 id="heading-synchronization-primitives">Synchronization Primitives</h3>
<p>Barrier synchronization ensures that all threads need to catch up to the barrier before they all continue.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1727712642377/d9b8a8a9-b788-4356-a552-a63e470ba8a0.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-atomic-operations">Atomic Operations</h3>
<p>Atomic instructions need read-modify-write (RMW) semantics.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1727713046139/0499f757-29ab-46bd-a6f9-3c3865539068.png" alt class="image--center mx-auto" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1727713113608/4acf293b-9578-4382-9723-5c05031c8dfe.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-scalability-issues-with-synchronization">Scalability Issues with Synchronization</h3>
<ol>
<li><p>Latency: the time a thread spends acquiring a lock that is currently free.</p>
</li>
<li><p>Waiting Time: If a thread wants to acquire a lock, how long does it need to wait before it has its turn with the data?</p>
</li>
<li><p>Contention: when a lock is released, the waiting threads scramble for it. How long does this contention take before one of them wins?</p>
</li>
</ol>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1727713317745/3a4f80bd-677f-40a7-bc35-7049e63fe187.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-naive-spinlock">Naive Spinlock</h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1727714346142/4acedd39-3ff1-4a6a-8bae-6f6315b1bda2.png" alt class="image--center mx-auto" /></p>
<p>The problem with this is too much contention: every waiter hammers the shared lock variable, it does not exploit caches, and it disrupts useful work on other processors.</p>
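<p>The naive spinlock spins directly on an atomic test-and-set. A sketch using C11 atomics and pthreads (not the lecture’s code; <code>naive_spinlock_demo</code> is an invented driver to exercise the lock):</p>

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static atomic_flag naive_lock = ATOMIC_FLAG_INIT;
static int shared_counter = 0;

/* Spin directly on the atomic test-and-set: every failed attempt is bus traffic. */
static void spin_acquire(void) { while (atomic_flag_test_and_set(&naive_lock)) ; }
static void spin_release(void) { atomic_flag_clear(&naive_lock); }

static void *increment_worker(void *arg) {
    (void)arg;
    for (int i = 0; i < 10000; i++) {
        spin_acquire();
        shared_counter++; /* critical section */
        spin_release();
    }
    return 0;
}

/* Four threads, 10000 increments each; returns the final count (40000). */
int naive_spinlock_demo(void) {
    pthread_t t[4];
    shared_counter = 0;
    for (int i = 0; i < 4; i++) pthread_create(&t[i], 0, increment_worker, 0);
    for (int i = 0; i < 4; i++) pthread_join(t[i], 0);
    return shared_counter;
}
```

<p>Correctness holds, but every failed <code>test_and_set</code> is an exclusive bus transaction, which is exactly the contention problem described above.</p>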
<h3 id="heading-caching-spinlock">Caching Spinlock</h3>
<p>Here we assume the hardware keeps the caches coherent via the system bus.</p>
<p>Waiting CPUs spin locally on their cached copy of the lock. When the lock is released, each waiter tries to acquire it; the losers go back to spinning locally.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1727714995178/5f7098b5-f261-4898-8ae8-5bb9768f4d01.png" alt class="image--center mx-auto" /></p>
<p>If there is high contention, it will take O(n²) bus transactions where n is the number of processes trying to access the lock at a given time.</p>
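<p>This spin-on-read idea (often called test-and-test-and-set) can be sketched with C11 atomics; the function names are invented, and this is a sketch rather than a production lock:</p>

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_bool ttas_lock = false;

void ttas_acquire(void) {
    for (;;) {
        /* Spin on the locally cached copy: no bus transactions while it reads true. */
        while (atomic_load_explicit(&ttas_lock, memory_order_relaxed))
            ;
        /* Lock looked free: try the atomic exchange; on a lost race, spin again. */
        if (!atomic_exchange(&ttas_lock, true))
            return;
    }
}

void ttas_release(void) { atomic_store(&ttas_lock, false); }
```

<p>The O(n²) bus traffic arises at release time: all n waiters see the lock go free and race with atomic exchanges, and only one wins each time.</p>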
<h3 id="heading-spinlocks-with-delay">Spinlocks with Delay</h3>
<p>Each processor uses a different delay, which increases dynamically every time a check of the lock fails. This scheme works even without spinning on a cached copy.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1727715358167/3772c408-da01-4097-a3d8-f8a0a4c7505b.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-ticket-lock">Ticket Lock</h3>
<p>With the ticket lock model, we can ensure fairness by using a queue approach. For example, at a popular deli, you can take a ticket so that you know how many people are ahead of you, and when you will be ready to be served.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1727715517254/b0e7b518-39e8-4829-8a2e-939eb25a6dba.png" alt class="image--center mx-auto" /></p>
<p>This achieves fairness but causes extra contention on the network, since every waiter must observe the now-serving counter each time it increases.</p>
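<p>The deli analogy maps directly onto two counters: a ticket dispenser and a now-serving display. A sketch in C11 atomics (illustrative names, not from the lecture):</p>

```c
#include <assert.h>
#include <stdatomic.h>

typedef struct {
    atomic_uint next_ticket;  /* the dispenser: taken once on arrival  */
    atomic_uint now_serving;  /* the display: advanced once on release */
} ticket_lock_t;

void ticket_acquire(ticket_lock_t *l) {
    /* One atomic op per acquire: grab a unique ticket. */
    unsigned me = atomic_fetch_add(&l->next_ticket, 1);
    while (atomic_load(&l->now_serving) != me)
        ; /* every waiter re-reads now_serving: fair, but noisy on each release */
}

void ticket_release(ticket_lock_t *l) {
    atomic_fetch_add(&l->now_serving, 1);
}
```

<p>FIFO order falls out for free: tickets are handed out in arrival order, and each release admits exactly the next ticket holder.</p>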
<h3 id="heading-spinlock-summary">Spinlock Summary</h3>
<ol>
<li><p>Read with T+S: No fairness</p>
</li>
<li><p>T+S with Delay: No fairness</p>
</li>
<li><p>Ticket Lock: Fair but noisy</p>
</li>
</ol>
<h3 id="heading-array-based-queueing-lock">Array-based Queueing Lock</h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1727717745363/600202d5-7d09-4d7d-8fa0-70352ccafa72.png" alt class="image--center mx-auto" /></p>
<p>In this style, we get a lock that is both fair and quiet: each waiter spins on its own array slot. The only real downside is the higher space complexity: O(n) space per lock.</p>
<h3 id="heading-linked-based-queueing-lock">Linked Based Queueing Lock</h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1727718515158/6df660d4-6bce-4cb4-bed9-f5cf818bb0a7.png" alt class="image--center mx-auto" /></p>
<p>Anderson’s array-based queueing lock tends to be slightly faster, as linked lists carry more overhead.</p>
<h3 id="heading-summary-1">Summary</h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1727718961186/bf3ed472-1482-4277-8b2c-9e5139bc60b6.png" alt class="image--center mx-auto" /></p>
<h2 id="heading-communication">Communication</h2>
<h3 id="heading-barrier-synchronization">Barrier Synchronization</h3>
<p>All threads have to wait until all threads reach the barrier.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1727719764175/64a758f0-b8a9-466c-aeb6-4dc35a21eb51.png" alt class="image--center mx-auto" /></p>
<p>This has a problem, though: before the last processor resets count to N, other processors may race ahead to the next barrier and fall through it.</p>
<p>We can add another spin on the count being reset, but that means two spin loops per barrier, which is not optimal. We want one spin loop.</p>
<h3 id="heading-sense-reversing-barrier">Sense Reversing Barrier</h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1727720105485/1a77cb27-8880-4390-a1e0-bf6655c53cfc.png" alt class="image--center mx-auto" /></p>
<p>There is too much sharing in this approach.</p>
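<p>A centralized sense-reversing barrier can be sketched as below, assuming C11 atomics and pthreads (the names and the <code>sense_barrier_demo</code> driver are invented for illustration):</p>

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NTHREADS 4

static atomic_int  remaining    = NTHREADS; /* threads still to arrive this episode */
static atomic_bool global_sense = false;

/* Each thread carries its own sense flag, reversed at every barrier. */
static void barrier_wait(bool *local_sense) {
    *local_sense = !*local_sense;
    if (atomic_fetch_sub(&remaining, 1) == 1) {
        /* Last arrival: reset the count, then release everyone by flipping sense. */
        atomic_store(&remaining, NTHREADS);
        atomic_store(&global_sense, *local_sense);
    } else {
        while (atomic_load(&global_sense) != *local_sense)
            ; /* the single spin loop: no race into the next barrier */
    }
}

static atomic_int work_done = 0;

static void *phases(void *arg) {
    (void)arg;
    bool sense = false;
    for (int phase = 0; phase < 3; phase++) {
        atomic_fetch_add(&work_done, 1); /* "work" for this phase */
        barrier_wait(&sense);            /* nobody starts the next phase early */
    }
    return 0;
}

/* Runs 4 threads through 3 barrier episodes; returns total work items. */
int sense_barrier_demo(void) {
    pthread_t t[NTHREADS];
    for (int i = 0; i < NTHREADS; i++) pthread_create(&t[i], 0, phases, 0);
    for (int i = 0; i < NTHREADS; i++) pthread_join(t[i], 0);
    return atomic_load(&work_done);
}
```

<p>Because the count is reset before the sense flips, and waiters spin only on the sense, the race of the counting barrier disappears. The sharing problem remains: all threads spin on the one <code>global_sense</code> variable.</p>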
<h3 id="heading-tree-barrier">Tree Barrier</h3>
<p>A more scalable version of this approach is the tree barrier, which applies divide and conquer.</p>
<h3 id="heading-tournament-barrier">Tournament Barrier</h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1728778157761/ec38f139-9b14-4d62-8a4e-8f3bcfb43115.png" alt class="image--center mx-auto" /></p>
<p>This barrier is organized as a tournament with n players and log(n) rounds. With 8 players there are 3 rounds, with 4 matches in the first round. The tournament is rigged: the winner of every match is predetermined.</p>
<p>If the processes are executed on a shared-memory machine, then the winner can sit and wait for the losing processor to inform the winner that they have won. This is especially useful if you do not have a cache-coherent multi-processor (ex. NCC-NUMA).</p>
<p>The winners of each round will go to the next round, where the winners are still predetermined, all the way up until the final round, which in our example is round 3.</p>
<p>The main purpose of this arrangement is that the spin location for each process waiting on the other process is statically determined at every level.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1728778739143/cd11ef08-ab18-47b3-ab42-eae1791b814f.png" alt class="image--center mx-auto" /></p>
<p>When the champion of the tournament is declared, we know that everyone has arrived at the barrier, and at this point, the champion is the only one who knows. The champion will then wake up the loser of that round, and then they will go back to every level and wake up the loser of the previous match, and so on. After everyone is awake, the next phase of computation can begin.</p>
<h3 id="heading-dissemination-barrier">Dissemination Barrier</h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1728786381970/61926703-e254-49db-9925-5a147cb13515.png" alt class="image--center mx-auto" /></p>
<p>This form of barrier uses information diffusion. N does not need to be a power of 2. It’s like a “well-orchestrated gossip protocol”: in each round, each processor sends a message to an ordained peer, and which peer it sends to depends on the round.</p>
<p>In round k, the processor at position i sends to the processor at position (i + 2^k) mod n. Each processor decides independently that the round is over once it has both sent and received a message. There are O(n) communication events per round.</p>
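<p>The pairing rule is easy to state as code. A small sketch (the function names are invented helpers, not part of any barrier library):</p>

```c
#include <assert.h>

/* In round k (0-based), processor i sends a message to (i + 2^k) mod n. */
int dissemination_partner(int i, int k, int n) {
    return (int)((i + (1L << k)) % n);
}

/* ceil(log2(n)) rounds suffice; n need not be a power of two. */
int dissemination_rounds(int n) {
    int k = 0;
    while ((1 << k) < n) k++;
    return k;
}
```

<p>For n = 5, processor 3 in round 2 sends to (3 + 4) mod 5 = 2, and three rounds suffice for all five processors to learn that everyone has arrived.</p>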
<h2 id="heading-mpi">MPI</h2>
<p>Message Passing Interface (MPI) exists when messages must be passed between computers in a distributed computing context. This means we need a protocol for sharing information between processes across multiple computers that don’t share the same memory, CPUs, etc. This is where MPI comes in.</p>
<p>Communication Domain:</p>
<ul>
<li><p>Processes that can communicate with each other</p>
</li>
<li><p>Stored in communicator</p>
</li>
<li><p>Communicator type: MPI_Comm</p>
</li>
<li><p>Predefined default: MPI_COMM_WORLD</p>
</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1728765708880/036d7267-74b5-407c-b12e-852cd78e5296.png" alt class="image--center mx-auto" /></p>
<p>MPI_Init: Initializes all the data structures and begins running MPI. This is called before any other MPI call.</p>
<p>MPI_Finalize: Turns off communications, frees memory, etc. before exiting back to the OS.</p>
<p>MPI_Comm_size: Allows us to get the number of processes that are communicating inside the communicator.</p>
<p>MPI_Comm_rank: This “tells us who we are”: the index of the local process within the communicator.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1728768556959/7c2a0e85-6fb2-4238-b0f3-8d33dfc5eadf.png" alt class="image--center mx-auto" /></p>
<p>MPI_Send: called when we want to transmit data from our process to another. The buffer is a pointer to where our data is stored; the count is the number of objects of type datatype to send; dest is the rank (recall that rank is the local index of a process within the communicator) the data should be sent to; and the tag is a value used to distinguish kinds of messages, according to how we write our program.</p>
<p>MPI_Recv: The buffer points to the memory where the received data should be stored; source is the rank of the sender; and the status parameter reports the success or failure of the communication. The remaining parameters mirror MPI_Send.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1728768758293/6274475f-3de1-4858-8488-27977c95c562.png" alt class="image--center mx-auto" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1728769806966/6296db36-9acb-4978-927a-060af2cdfb12.png" alt class="image--center mx-auto" /></p>
<h1 id="heading-distributed-systems">Distributed Systems</h1>
<h2 id="heading-definitions">Definitions</h2>
<p>The individual autonomy of a distributed system is what separates it from general parallel computing.</p>
<h3 id="heading-what-is-a-distributed-system">What is a distributed system?</h3>
<p>A distributed system is interconnected by some kind of LAN/WAN network (fiber, cable, satellite, etc.), and there is no physical shared memory between nodes. The time a single node takes to do some meaningful processing is called the event computation time, te; the communication/messaging time is called tm. The third property of a distributed system is that the communication time between nodes is significantly larger than the event computation time: tm ≫ te.</p>
<p>Lamport’s definition: “A system is distributed if the message transmission time, tm, is not negligible compared to the time between events in a single process.” By this definition, even a cluster is a distributed system. The importance of the inequality lies in the design of the algorithms that span the nodes of the network.</p>
<h3 id="heading-happened-before-relationship">“Happened Before” Relationship</h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1731278894935/44826cad-dbdb-467f-a91f-143f267e8dfc.png" alt class="image--center mx-auto" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1731279139036/6ca31437-393b-47ff-8c53-83e084a2525f.png" alt class="image--center mx-auto" /></p>
<h2 id="heading-lamport-clocks">Lamport Clocks</h2>
<p>Each node knows about its own events and its communication events; these are the only events each node knows about.</p>
<p>Lamport’s logical clock builds on this idea, giving each event a timestamp.</p>
<ol>
<li><p>Each process has a local clock that monotonically increases across its own events: C1(a) &lt; C1(b)</p>
</li>
<li><p>For communication events, the timestamp of the receive event must be greater than that of the send, and also greater than the receiver’s current local counter: C1(a) &lt; C2(d), so choose C2(d) = max(C1(a), C2) + 1</p>
</li>
</ol>
<ol start="3">
<li><p>Concurrent Events (b and d) have arbitrary timestamps</p>
</li>
<li><p>Timestamps alone do not imply causality: C(x) &lt; C(y) does not mean x→y</p>
</li>
</ol>
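<p>The two clock rules above can be sketched in a few lines of C (the type and function names are invented for illustration):</p>

```c
#include <assert.h>

/* Lamport logical clock: one monotonically increasing counter per process. */
typedef struct { unsigned long time; } lamport_clock_t;

/* Rule 1: tick before every local event (including a send). */
unsigned long lamport_event(lamport_clock_t *c) {
    return ++c->time;
}

/* Rule 2: on receive, jump past both the message timestamp and the local clock. */
unsigned long lamport_receive(lamport_clock_t *c, unsigned long msg_ts) {
    c->time = (msg_ts > c->time ? msg_ts : c->time) + 1;
    return c->time;
}
```

<p>Note that these rules give only a partial order: as point 4 above says, a smaller timestamp does not by itself mean one event happened before another.</p>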
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1731281581983/835f28c0-670f-42d7-ad06-6bb25225038f.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-total-order">Total Order</h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1731282494286/2d2523fa-baf7-4807-b64d-a3ffd5ac9b0c.png" alt class="image--center mx-auto" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1731282720736/7c6f5eae-2e50-4b61-a9bb-3ebbc04a65a9.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-distributed-mutual-exclusion-lock-algorithm">Distributed Mutual Exclusion Lock Algorithm</h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1731283245007/946614b2-acaa-45f0-8249-ecd3d6e29425.png" alt class="image--center mx-auto" /></p>
<p>To request the lock, a process sends a message carrying its local timestamp to all the other processes; each of them places the request in its local priority queue, ordered by timestamp, and sends back an acknowledgement. When there is a tie, priority goes to the process with the lower process ID.</p>
<p>Now, how does a process know it holds the lock? First, its own request must be at the top of its priority queue. Second, it must either have acknowledgements from all the other nodes, or every request it has received so far must be later than its own lock request.</p>
<p>On release, a process sends an unlock message to its peers, and each peer removes the corresponding request from its queue.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1731283296068/d3de1ad3-f330-40f3-b81d-e80be5c7f06c.png" alt class="image--center mx-auto" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1731283447102/e8a0c03d-2f64-45fc-9186-c967b9c3104b.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-lamports-physical-clock">Lamports Physical Clock</h3>
<p>a → b =&gt; Ci(a) &lt; Cj(b): if a happened before b in physical time, then a’s timestamp is smaller than b’s.</p>
<p>Physical clock conditions:</p>
<ol>
<li><p>PC1 (bound on individual clock drift)</p>
</li>
<li><p>PC2 (bound on mutual drift)</p>
</li>
</ol>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1731283823820/0e0c7a74-6ef7-4991-b5d0-7bc06d62ef8b.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-ipc-time-and-clock-drift">IPC Time and Clock Drift</h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1731285202255/faa6975c-efff-47f9-8b48-fc39e33e3645.png" alt class="image--center mx-auto" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1731285390554/cee73404-e72a-4ac6-ab22-c3cf7404bc12.png" alt class="image--center mx-auto" /></p>
<h2 id="heading-latency-limits">Latency Limits</h2>
<h3 id="heading-latency-vs-throughput">Latency vs Throughput</h3>
<p>Latency: Elapsed Time</p>
<p>Throughput: Events per unit of time</p>
<p>Bandwidth: Measurement of Throughput</p>
<p>RPC Performance: Hardware Overhead and Software Overhead</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1731286813872/b08a3aff-c8af-43c6-8779-92eedd2e5183.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-components-of-rpc-latency">Components of RPC Latency</h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1731350395222/d94a01c9-dd84-4026-9253-d29ced01269a.png" alt class="image--center mx-auto" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1731350589057/8fbfc7b8-a77d-4454-ad0b-b7f0fe1feae9.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-marshaling-and-data-copying">Marshaling and Data Copying</h3>
<p>The kernel knows nothing about the layout of an RPC call’s arguments, so the client stub must marshal them into a contiguous packet. There are three copies along this path, and the copying is one of the biggest overheads in RPC. To reduce copies, we can marshal into a kernel buffer directly, which involves installing the marshaling code in the kernel.</p>
<p>Alternatively, we can use a shared descriptor between the client stub and the kernel, which informs the kernel about the layout of the arguments on the stack.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1731351283767/64fde42c-7d77-4d93-8812-c4250ff9f932.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-control-transfer-1">Control Transfer</h3>
<p>There are potentially four context switches per call: the client switches out after sending the call; the server switches in to run the procedure; the server switches out after sending the result; and the client switches back in to receive it.</p>
<p>The critical paths for latency are the incoming call on the server and the incoming result on the client. We can reduce the four context switches to two by overlapping the non-critical switches with the network communication.</p>
<p>We can reduce it to 1 by spinning instead of switching on the client side.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1731353038763/c6d66800-f73d-4d2c-b2b1-77146a97c828.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-protocol-processing">Protocol Processing</h3>
<p>What transport should RPC use? A LAN is reliable, and we can exploit that to reduce latency. Assuming a reliable LAN, the choices for leaner transport are:</p>
<ul>
<li><p>No low level acknowledgements. The semantics of RPC involves having a return, which can serve as a form of acknowledgement.</p>
</li>
<li><p>Hardware checksum for packet integrity. Instead of using software checksum, we can make protocol processing leaner by using only hardware checksum.</p>
</li>
<li><p>No client-side buffering. Since the client is blocked, we don’t need to buffer; we can just resend the call when necessary.</p>
</li>
<li><p>Overlap server-side buffering with result transmission.</p>
</li>
</ul>
<h2 id="heading-active-networks">Active Networks</h2>
<h3 id="heading-routing-on-the-internet">Routing on the Internet</h3>
<p>Traditionally, each router decides the next hop with a table lookup and simply forwards packets. Making the routers active replaces the table lookup with code carried by the packets themselves, so the next hop becomes dynamic and customized service can happen during network flow. How, and by whom, can this code be written to virtualize network behavior?</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1731354013441/f3ceed00-9166-4b88-a29f-5332ae3b78c4.png" alt class="image--center mx-auto" /></p>
<p>The OS must provide quality-of-service APIs so that applications can synthesize the code carried in the payload. Changing the protocol stack of every node in the world, though, is very difficult.</p>
<h3 id="heading-ants-toolkit">ANTS Toolkit</h3>
<p>The ANTS toolkit is an application-level package that adds an ANTS header to the payload. The resulting packet traverses the network: a regular node that cannot process the ANTS header treats the packet normally, while a more sophisticated node can make use of it. Active nodes are kept only at the edges of the network, which allows the core IP network to remain unchanged.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1731354254857/315c7a7b-d74f-4da5-a1eb-240737d0f3cb.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-ants-capsule-api">ANTS Capsule + API</h3>
<p>The ANTS header contains a version, a type, and a previous-node field. The capsule itself does not contain the code; instead, its type field identifies the code needed.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1731354421459/8070e1af-2c9c-4d84-8247-93fdedad9153.png" alt class="image--center mx-auto" /></p>
<p>When a node receives a capsule whose code it lacks, it uses the capsule’s previous-node field to request the code from the upstream node. The receiving node computes a fingerprint over the code and compares it against the type field to verify the code is correct. Once the code is obtained, it is saved in a soft store for future use. If the code cannot be found, the node simply drops the capsule.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1731354689426/163a9bb4-a800-475b-aac3-03f8ce6a7dad.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-pros-and-cons-of-active-networks">Pros and Cons of Active Networks</h3>
<p>Pro</p>
<ul>
<li>Flexibility from App Perspective</li>
</ul>
<p>Cons</p>
<ul>
<li><p>Protection Threats</p>
<ul>
<li><p>ANTS Runtime Safety =&gt; Java Sandboxing</p>
</li>
<li><p>Code Spoofing =&gt; Robust Fingerprint</p>
</li>
<li><p>Soft State Integrity =&gt; Restricted API</p>
</li>
</ul>
</li>
<li><p>Resource Management Threats</p>
<ul>
<li><p>At Each Node =&gt; Restricted API</p>
</li>
<li><p>Flooding the Network =&gt; Internet Already Susceptible</p>
</li>
</ul>
</li>
</ul>
<p>One roadblock for the active networks vision is that router vendors need to buy into it. ANTS software routing cannot match the throughput needed in the Internet core: routers traditionally just do fast table lookups and move on, and more intelligent software routing is slower.</p>
<h3 id="heading-feasibility">Feasibility</h3>
<p>Router makers loathe opening up the network =&gt; only feasible at the edge of the network.</p>
<p>Software routing cannot match hardware routing =&gt; only feasible at the edge of the network.</p>
<p>Social and psychological reasons =&gt; hard for the user community to accept arbitrary code executing in the public routing fabric.</p>
<p>There wasn’t a killer app when active networks were first introduced; it seemed like a solution to a problem that didn’t really exist. Software-defined networking has since given active networks a new lease on life, and cloud computing has done much to virtualize networks, much as active networks approached the problem.</p>
<h2 id="heading-systems-from-components">Systems from Components</h2>
<h3 id="heading-big-picture-1">Big Picture</h3>
<p>In the specification stage of the design cycle, we can use IOA, which has C-like syntax and a composition operator. In the implementation stage, we can use OCaml, which is object-oriented, produces efficient code comparable to C, and is a nice complement to IOA. NuPrl is then used to optimize the OCaml code: it takes unoptimized OCaml as input and outputs optimized OCaml that is verified to be functionally equivalent.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1731355443520/e393fcf5-f01c-4d97-97fa-2dceb6871a89.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-digging-deeper">Digging Deeper</h3>
<p>The abstract behavioral spec in IOA lets us prove properties using the machinery IOA provides. We then refine it into a concrete behavioral spec, and implement that spec in OCaml. We are taking many components and meshing them together, which OCaml is good at.</p>
<p>There is no guarantee that the unoptimized OCaml implementation is the same as IOA specification. The properties are proven, but the implementation may not be faithful to the abstract behavioral spec.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1731355707484/06388699-cfbc-4bec-9a29-572760c5121e.png" alt class="image--center mx-auto" /></p>
<p>By using a theorem prover framework, you can convert OCaml code that is unoptimized to unoptimized NuPrl code. Then you can use the theorem prover framework to optimize the NuPrl code and it’s proven to be functionally equivalent. Afterwards, the NuPrl code will be converted back to OCaml, optimized.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1731355839327/13b91a1b-9db3-419b-b780-3c95b8abe2e1.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-putting-the-methodology-to-work">Putting the Methodology to Work</h3>
<p>Start with the IOA spec, going from abstract to concrete. How do we synthesize the protocol stack from the concrete spec?</p>
<p>Getting to an unoptimized OCaml Implementation:</p>
<ul>
<li><p>Ensemble suite of microprotocols</p>
<ul>
<li>Flow Control, sliding window, encryption, scatter/gather, etc.</li>
</ul>
</li>
<li><p>Well Defined interfaces allowing composition</p>
</li>
<li><p>Facilitates component based design</p>
</li>
</ul>
<p>Recall original goal: Mimic VLSI design</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1731356005430/7d560363-1b07-4204-9261-cbe312cd6809.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-how-to-optimize-the-protocol-stack">How to Optimize the Protocol Stack</h3>
<p>Layering could lead to inefficiencies</p>
<ul>
<li>Analogy to VLSI component based design breaks down for software</li>
</ul>
<p>Interfaces in software introduce more inefficiencies, such as copying across layers</p>
<p>Several Sources of Optimization Possible</p>
<ul>
<li><p>Explicit memory management instead of implicit GC</p>
</li>
<li><p>Avoid Marshaling / Unmarshaling across layers</p>
</li>
<li><p>Buffering in parallel with transmission</p>
</li>
<li><p>Header Compression</p>
</li>
<li><p>Locality Enhancement for common code sequences</p>
</li>
</ul>
<p>How do we automate this process?</p>
<h3 id="heading-nuprl-to-the-rescue">NuPrl to the Rescue</h3>
<p>Go layer by layer in the protocol stack and check whether the optimization theorems can optimize the code. The theorem-proving framework can collapse layers: it can generate bypass code if the CCP is satisfied. CCP stands for common case predicate; when the CCP is satisfied, much simpler processing suffices, otherwise the general layered path must be used. Because the process works layer by layer, it can be applied one layer at a time.</p>
<p>Then we convert back to OCaml. Both the optimization and functional equivalence are proven as well.</p>
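<p>A toy sketch of the common-case-predicate idea, with a hypothetical predicate (the real CCPs and bypass code are generated by NuPrl, not hand-written): when an incoming packet is the expected in-order, unfragmented one, collapsed bypass code runs instead of full layer-by-layer processing.</p>

```java
// Sketch of the CCP (common case predicate) idea with made-up conditions.
class CcpBypass {
    // Hypothetical CCP: packet is in order and not fragmented.
    static boolean ccp(int seq, int expected, boolean fragmented) {
        return seq == expected && !fragmented;
    }

    static String process(int seq, int expected, boolean fragmented) {
        if (ccp(seq, expected, fragmented))
            return "bypass";   // collapsed fast-path code
        return "layered";      // fall back to the general protocol stack
    }
}
```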
<h1 id="heading-distributed-objects-and-middleware">Distributed Objects and Middleware</h1>
<h2 id="heading-spring-os">Spring OS</h2>
<h3 id="heading-how-to-innovate-os">How to innovate OS?</h3>
<p>Brand new OS? or Better implementation of known OS?</p>
<p>The marketplace needs large, complex server software =&gt; take the "Intel inside" approach… in this case "Unix inside" for box makers like Sun Microsystems</p>
<p>If you're a company like Sun Microsystems, you want the external interfaces to stay the same while you innovate internally, so that extensibility, flexibility, etc. can improve</p>
<h3 id="heading-object-based-vs-procedural-design">Object based vs Procedural Design</h3>
<p>Objects contain the state and the methods that manipulate that state, local to the object. You get strong isolation. Procedural design takes a more shared-state approach. In Spring, the object model is used to build the OS kernel.</p>
<h3 id="heading-spring-approach">Spring Approach</h3>
<p>Build strong interfaces for each subsystem: open, flexible, and extensible. Spring uses an IDL, which allows third-party software vendors to build their own software that integrates with Spring subsystems. This is similar to the microkernel approach, following Liedtke's principle: the kernel itself only handles CPU and memory.</p>
<h3 id="heading-nucleus-microkernel-of-spring">Nucleus Microkernel of Spring</h3>
<p>The nucleus holds many doors, which domains create in the nucleus in order to communicate with a target domain. The nucleus is also what holds the threads. If a client wants to invoke something in a target domain, its door table points to a particular door, which is accessed through the nucleus. Each domain has its own unique door table.</p>
<p>The Spring kernel is composed of the nucleus and the memory manager, which handles the domains' representation in the address space. The nucleus is involved in every door call and manages the permissions accordingly. A thread is then allocated in the target domain to perform the work, after which it is deactivated and control is given back to the calling domain. By going through the nucleus, cross-domain calls can be done very quickly.</p>
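<p>A toy model of the door mechanism, under stated assumptions (the <code>Door</code> interface, the table size, and the string-based handler are all hypothetical; real Spring doors carry protected handles and thread state): each domain has its own door table, a small integer indexes a door, and the door is the only way to enter a target domain.</p>

```java
// Toy model of Spring doors and per-domain door tables.
interface Door {
    String invoke(String arg); // the target domain's entry point
}

class Domain {
    private final Door[] doorTable = new Door[16]; // per-domain door table

    void installDoor(int fd, Door door) { doorTable[fd] = door; }

    // Cross-domain call: resolve the door-table entry, then let the
    // "nucleus" transfer control to the target domain's handler.
    String doorCall(int fd, String arg) {
        if (fd < 0 || fd >= doorTable.length || doorTable[fd] == null)
            throw new IllegalArgumentException("invalid door " + fd);
        return doorTable[fd].invoke(arg);
    }
}
```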
<h3 id="heading-object-invocation-across-the-network">Object Invocation Across the Network</h3>
<p>Network proxies are used to connect domains that may sit on different nuclei. Proxies may also use different protocols, giving flexibility in how you build your network OS. Proxies are invisible to the client and server: neither is aware of where the other is located, and they don't care either.</p>
<p>For example: Proxy A exports a net handle embedding Door X to Proxy B.<br />Proxy B uses that net handle to connect the nuclei. When a client domain makes an invocation, it thinks it is directly accessing the server domain, but the call may go through Door Y in Nucleus B to reach Proxy B, which uses the net handle exported by Proxy A, which leads to Door X, which belongs to the server domain.</p>
<h3 id="heading-secure-object-invocation">Secure Object Invocation</h3>
<p>The client domain goes through the door and then accesses a front object. The front object is what the client domain sees; it in turn accesses an underlying object where the real checks happen. Policies can be associated with the front object.</p>
<p>An ACL, pointed to from the underlying object, is checked before invocation</p>
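<p>A minimal sketch of the front-object pattern, assuming hypothetical names (<code>FileObject</code>, <code>FrontObject</code>, a plain string ACL): the client only ever sees the front object, and the ACL associated with the underlying object is checked before the real invocation proceeds.</p>

```java
// Sketch of Spring's secure object invocation with made-up types.
interface FileObject {
    String read(String user);
}

class UnderlyingFile implements FileObject {
    public String read(String user) { return "contents"; }
}

class FrontObject implements FileObject {
    private final FileObject underlying;
    private final String[] acl; // who may invoke the underlying object

    FrontObject(FileObject underlying, String[] acl) {
        this.underlying = underlying;
        this.acl = acl;
    }

    public String read(String user) {
        for (String allowed : acl)
            if (allowed.equals(user))
                return underlying.read(user); // ACL passed: real invocation
        throw new SecurityException("access denied for " + user);
    }
}
```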
<h3 id="heading-virtual-memory-management-in-spring">Virtual Memory Management in Spring</h3>
<p>Regions are sets of pages within the linear address space. Memory objects are then mapped onto regions; a memory object represents a portion of memory space that can be backed by files, swap space, etc.</p>
<h3 id="heading-memory-object-specific-paging">Memory Object Specific Paging</h3>
<p>A memory object is connected to a pager object, and the pager object is connected to a cache object in the virtual memory manager. Memory can be split into whatever regions are wanted. There isn't a single pager object required for all the cache objects; it's possible to have a cache object in one address space whose pager object maps to some other memory object. The pager object is responsible for coordination in any complex memory-management scenario. Address space managers are responsible for the linear address space, which is mapped to cache objects managed by pager objects for different regions of memory.</p>
<p>Summary</p>
<p>Object Oriented Kernel</p>
<ul>
<li><p>nucleus → threads + IPC</p>
</li>
<li><p>microkernel → nucleus + address space</p>
</li>
<li><p>Door + door table → basis for cross domain calls</p>
</li>
<li><p>Object invocation and cross-machine calls</p>
</li>
<li><p>Virtual memory management</p>
<ul>
<li>Address space object, memory object, external pagers, cache object</li>
</ul>
</li>
</ul>
<h3 id="heading-dynamic-client-server-relationship">Dynamic Client Server Relationship</h3>
<p>The client-server relationship is dynamic: the client does not need to know where the server runs, and the server implementation can change without affecting the client.</p>
<h3 id="heading-subcontract">Subcontract</h3>
<p>The subcontract mechanism simplifies the client-server relationship. Client-side stub generation is simplified, and the subcontract is responsible for the details.</p>
<p>When the client stub needs to marshal arguments, it contacts the subcontract, which handles the details of the mechanism. On the server side, the subcontract is allowed to create, revoke, and process invocations.</p>
<p>All of the magic happens in the subcontract system.</p>
<h2 id="heading-java-rmi">Java RMI</h2>
<h3 id="heading-java-distributed-object-model">Java Distributed Object Model</h3>
<p>Remote Object</p>
<ul>
<li>Accessible from different address spaces</li>
</ul>
<p>Remote Interface</p>
<ul>
<li>Declarations for methods in a remote object</li>
</ul>
<p>Failure Semantics</p>
<ul>
<li>Clients deal with RMI exceptions</li>
</ul>
<p>Similarities / Differences to local objects</p>
<ul>
<li><p>Object references can be params</p>
</li>
<li><p>Param only as value / result</p>
</li>
</ul>
<h3 id="heading-reuse-of-local-implementation">Reuse of Local Implementation</h3>
<p>An existing local implementation can be reused by creating a remote interface and making the implementer part of the local implementation. The implementer then has to make instances of the objects remotely accessible itself =&gt; not preferable</p>
<h3 id="heading-reuse-of-remote">Reuse of Remote</h3>
<p>Extending Remote lets the Java magic make the bank account implementation publicly visible. Java RMI does the heavy lifting to make the server object visible to network clients =&gt; preferable</p>
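<p>A sketch of the preferred pattern using the real <code>java.rmi</code> types: declaring the interface as extending <code>Remote</code>, with every method throwing <code>RemoteException</code>, is what lets RMI expose it to network clients. A real deployment would also export the object (e.g., by extending <code>UnicastRemoteObject</code>) and bind it in an RMI registry; that part is omitted here so the sketch stays self-contained.</p>

```java
import java.rmi.Remote;
import java.rmi.RemoteException;

// The remote interface: what network clients see.
interface BankAccount extends Remote {
    void deposit(float amount) throws RemoteException;
    float getBalance() throws RemoteException;
}

// The server-side implementation. In a real deployment it would extend
// UnicastRemoteObject (or be exported explicitly) and be bound in a
// registry; those steps are omitted here.
class BankAccountImpl implements BankAccount {
    private float balance = 0;

    public void deposit(float amount) { balance += amount; }
    public float getBalance() { return balance; }
}
```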
<h3 id="heading-rmi-implementation-rrl">RMI Implementation - RRL</h3>
<p>The Remote Reference Layer is responsible for marshaling and unmarshaling between client stubs and server skeletons: the RRL serializes and deserializes between the two. Similar to the subcontract, the RRL hides the details of which server is used, how the server is implemented, where the server is, etc.</p>
<h3 id="heading-rmi-implementation-transport">RMI Implementation - Transport</h3>
<p>Endpoint</p>
<ul>
<li><p>Protection Domain</p>
</li>
<li><p>Table of remote objects</p>
</li>
</ul>
<p>Connection Management</p>
<ul>
<li><p>Setup, teardown, listen</p>
</li>
<li><p>liveness monitoring</p>
</li>
<li><p>Choice of export</p>
</li>
</ul>
<p>RRL decides right export to use</p>
<p>I/O on channel using connections</p>
<h3 id="heading-enterprise-java-beans">Enterprise Java Beans</h3>
<h3 id="heading-n-tier-applications">N-Tier Applications</h3>
<p>Each layer in an app has to worry about persistence, transactions, caching, clustering, and security.</p>
<h1 id="heading-distributed-subsystem">Distributed Subsystem</h1>
<h2 id="heading-global-memory-systems">Global Memory Systems</h2>
<h3 id="heading-context-for-gms">Context for GMS</h3>
<p>Normally, the virtual memory manager gives the illusion that the working set can be much larger than the memory that is actually available. Memory pressure may differ across nodes: some may be busy while others are idle.</p>
<p>Is it possible to use the memory of another node in a LAN? As networking technology has gotten quicker, it has become possible to page across the network using cluster memory. In GMS, we look at using cluster memory for reads across the cluster.</p>
<h3 id="heading-basics">Basics</h3>
<p>“Cache” refers to physical memory (DRAM) not processor cache</p>
<p>There is a sense of community for handling page faults at a node</p>
<p>The physical memory of a node is split:</p>
<p>Local ← Working set<br />Global ← Spare memory</p>
<h3 id="heading-handling-page-faults">Handling page faults</h3>
<p>Case 1:</p>
<p>Common Case:</p>
<ul>
<li><p>Page fault for X on node P</p>
</li>
<li><p>Hit in global cache of some node Q</p>
</li>
</ul>
<p>On the page fault, local memory pressure at P increases, so P takes its oldest (global) page and sends it over to host Q. Host Q's split remains unchanged, because it sends host P the faulted page and receives one in return.</p>
<p>Case 2:</p>
<p>Common Case with memory Pressure at P:</p>
<ul>
<li><p>Page fault for X on node P</p>
</li>
<li><p>Swap LRU page Y for X</p>
</li>
</ul>
<p>Host P swaps out its LRU page Y to host Q and gets the faulted page X from host Q's global cache in exchange.</p>
<p>Case 3:</p>
<p>Faulting Page on disk:</p>
<ul>
<li><p>Page fault for x on node P</p>
</li>
<li><p>Page not in cluster</p>
</li>
</ul>
<p>Host P's global cache shrinks and its local cache grows. Host P reads the faulted page from disk and sends a replacement page to the host holding the globally oldest page, which can either write it to disk or throw it out.</p>
<p>Case 4:</p>
<p>Faulting Page Actively Shared:</p>
<ul>
<li><p>Page fault for X on node P</p>
</li>
<li><p>Page in some peer node Q’s local cache</p>
</li>
</ul>
<p>Host P copies the page from host Q's local cache (Q keeps its copy, since the page is actively shared). To make room, host P sends a page to host R, the node with the globally oldest page, which chooses a victim to push onto the disk or discard.</p>
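<p>The four cases can be hard to keep straight; here is a toy sketch of case 1 above (all structure names are hypothetical, and ages are approximated by queue order): P grows its local cache by the faulted page and gives up its oldest global page to Q, so Q's total stays unchanged.</p>

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of a GMS node: physical memory split into a local part
// (working set) and a global part (spare memory lent to the community).
class GmsNode {
    final Deque<String> local = new ArrayDeque<>();
    final Deque<String> global = new ArrayDeque<>();

    int total() { return local.size() + global.size(); }
}

class Gms {
    // Case 1: P faults on page x, which is found in Q's global cache,
    // and P still has spare (global) pages of its own.
    static void faultCase1(GmsNode p, GmsNode q, String x) {
        q.global.remove(x);                      // Q gives up the faulted page
        p.local.addFirst(x);                     // P's local cache grows by one
        q.global.addLast(p.global.removeLast()); // P's oldest global page moves to Q
    }
}
```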
<h3 id="heading-behavior-of-algorithm">Behavior of Algorithm</h3>
<p>A completely idle node becomes a memory server for peers on the cluster</p>
<h3 id="heading-geriatrics">Geriatrics</h3>
<p>Management work needs to be distributed</p>
<p>Epoch parameters</p>
<ul>
<li><p>T Max Duration</p>
</li>
<li><p>M max replacements</p>
</li>
</ul>
<p>At the start of each epoch</p>
<ul>
<li><p>Send page age information to the initiator</p>
</li>
<li><p>Receive min age back: the cutoff identifying the oldest M pages cluster-wide</p>
</li>
</ul>
<p>The M oldest pages are the candidates for replacement during the next epoch.</p>
<p>Action at a node on a page fault, with page Y as the eviction candidate:</p>
<p>Age(page y) &gt; min age =&gt; discard</p>
<p>Age(page y) &lt; min age =&gt; send to peer node</p>
<p>Think Global - Act Local!</p>
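<p>The per-node decision above can be sketched as a tiny function (the class and method names are made up for illustration): once the initiator has distributed min age for the epoch, every node can act locally on an eviction candidate Y.</p>

```java
// Sketch of the local eviction decision in the geriatrics algorithm.
class Geriatrics {
    static String evict(long ageOfY, long minAge) {
        if (ageOfY > minAge)
            return "discard";      // Y is among the cluster's oldest M pages,
                                   // so it would be replaced this epoch anyway
        return "send-to-peer";     // Y is young enough to live on in some
                                   // peer's global cache
    }
}
```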
<h3 id="heading-implementation-in-unix">Implementation in Unix</h3>
<p>OSF/1 is the OS used by the researchers building this.</p>
<p>The memory manager is modified to consult GMS, which does the work of checking whether a faulted page is in a remote node's global cache or on disk.</p>
<p>GMS is directly integrated and handles the coordination between local and global memory.</p>
]]></content:encoded></item></channel></rss>