Table of Contents

1 Threads vs processes

From a stackoverflow thread:

  • Threads run in a shared memory space
  • Processes run in separate memory spaces

    So threads have shared state, which is both a good and a bad thing: it allows communication between threads, but it also leads to race conditions.

    Processes circumvent the issue of race conditions by having separate states, but this also makes communication harder.
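
The shared-state point above can be sketched in Python (the names here are illustrative, not from the text): four threads update the same dict, with a lock guarding the increment so the race condition can't lose updates.

```python
import threading

# Threads share memory, so a plain dict works as a communication
# channel between them. The lock guards the read-modify-write.
shared = {"count": 0}
lock = threading.Lock()

def worker(n):
    for _ in range(n):
        with lock:  # without this, increments could interleave and be lost
            shared["count"] += 1

threads = [threading.Thread(target=worker, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(shared["count"])  # 40000 -- every increment survived
```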

1.1 Interprocess Communication / IPC approaches

Method: short description, and who provides it (OS or other environments):

  • File: Record stored on disk, or record synthesized on demand by a file server, which can be accessed by multiple processes. (Most OSes)
  • Signal: System message sent from one process to another, not usually used to transfer data but to remotely command the partnered process. (Most OSes)
  • Socket: Data stream sent over a network interface, either to a different process on the same computer or to another computer. Typically byte-oriented, sockets rarely preserve message boundaries, so data written through them requires explicit framing to preserve boundaries. (Most OSes)
  • Message queue: Data stream similar to a socket, but one that usually preserves message boundaries. Typically implemented by the OS; multiple processes can read and write to the queue without being directly connected to each other. (Most OSes)
  • Pipe: Unidirectional data channel. Data written to the write end of the pipe is buffered by the operating system until it is read from the read end. Two-way data streams between processes can be achieved by creating two pipes, utilizing STDIN and STDOUT. (All POSIX systems, Windows)
  • Named pipe: A pipe implemented through a file on the file system instead of STDIN and STDOUT. Multiple processes can read and write to the file as a buffer for the IPC data. (All POSIX systems, Windows, AmigaOS 2.0+)
  • Semaphore: Simple structure that synchronizes multiple processes acting on shared resources. (All POSIX systems, Windows, AmigaOS)
  • Shared memory: Multiple processes are given access to the same block of memory, which creates a shared buffer for the processes to communicate with each other. (All POSIX systems, Windows)
  • Message passing: Allows multiple programs to communicate using message queues and/or non-OS-managed channels; commonly used in the message-passing concurrency model. (Used in RPC, RMI, and MPI paradigms, Java RMI, CORBA, DDS, MSMQ, MailSlots, QNX, others)
  • Memory-mapped file: File mapped to RAM, which can be modified by changing memory addresses directly instead of writing to a stream. This shares the same benefits as a standard file. (All POSIX systems, Windows)
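
As a hedged sketch of the Pipe entry above: Python's subprocess module wires a child's STDOUT to an OS pipe that the parent reads. The child program here is made up for illustration.

```python
import subprocess
import sys

# The child writes to its STDOUT, which subprocess connects to a pipe.
child_code = "print('hello from the child process')"

result = subprocess.run(
    [sys.executable, "-c", child_code],  # spawn a child Python process
    capture_output=True,                 # wire its STDOUT to a pipe
    text=True,
)
msg = result.stdout.strip()              # read from our end of the pipe
print(msg)  # hello from the child process
```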

2 Concurrency vs Parallelism

Copied from Appendix A:

Concurrent computing
form of computing in which several computations are executed during overlapping time periods - concurrently.
Parallel computing
computation in which many calculations or the execution of processes are carried out simultaneously.

Though the terms are very closely related, one can have concurrency without parallelism, e.g. multitasking by time-sharing (sharing of computing resources) on a single-core CPU, and parallelism without concurrency, e.g. bit-level parallelism.

3 Concurrency

3.1 Futures vs Promises

Futures and promises describe an object that acts as a proxy for a result that is initially unknown, usually because the computation of its value is not yet complete.

Though "future" is often used interchangeably with "promise", when the two are distinguished, a future is a read-only placeholder view of a variable, while a promise is a writable, single-assignment container that sets the value of the future.

  • A future may be defined without specifying which promise will set its value
  • Different possible promises may set the value of a given future, though this can only be done once for a given future
  • A future and a promise can be created together/associated with each other, then:
    • The future is the value
    • The promise is the function that sets the value
    • Essentially, the return value (future) of an asynchronous function (promise)
    • Setting the value of a future is called resolving, fulfilling or binding it.
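
The split described above can be sketched with Python's own Future object: holding `f` is the read-only future side, while `set_result()` plays the role of the promise, the single write that resolves it.

```python
from concurrent.futures import Future

# Create the placeholder for a value that is not yet known.
f = Future()

# ... later, some producer resolves/fulfils the future, exactly once:
f.set_result(42)

# The consumer reads the now-resolved value; a second set_result()
# on the same future would raise an error (single assignment).
print(f.result())  # 42
```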

4 Programming Paradigms

4.1 Functional programming

  • No side effects; there's no mutable state.
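
As a small illustration of "no mutable state" (the function name is made up): a pure function returns a new value instead of changing its input in place.

```python
# A pure function: builds a fresh list, leaves `items` untouched.
def add_item(items, item):
    return items + [item]

original = [1, 2]
updated = add_item(original, 3)
print(original)  # [1, 2] -- unchanged, i.e. no side effect
print(updated)   # [1, 2, 3]
```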

4.2 Reactive programming

  • Changes propagate throughout a system automatically
    • Imagine three cells A, B, and C. A is defined as the sum of B and C. Whenever B or C changes, A reacts to update itself.
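
A toy version of the three cells above (not any particular reactive library): A is derived from B and C on every read, so it can never go stale.

```python
class Cells:
    def __init__(self, b, c):
        self.b = b
        self.c = c

    @property
    def a(self):
        return self.b + self.c  # recomputed whenever A is read

cells = Cells(b=1, c=2)
print(cells.a)  # 3
cells.b = 10    # change an input...
print(cells.a)  # 12 -- A "reacted" automatically
```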

4.3 Functional Reactive Programming

  • Model user input as a function that changes over time, abstracting away the idea of mutable state.

4.3.1 Signals

A signal sends values over time.
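
One way to sketch this is as a generator standing in for, say, a stream of mouse x-positions (the numbers here are made up):

```python
import itertools

# A "signal": a potentially infinite stream of values over time.
def mouse_x_signal():
    yield from itertools.count(step=5)

sig = mouse_x_signal()
samples = [next(sig) for _ in range(4)]  # sample the signal four times
print(samples)  # [0, 5, 10, 15]
```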

4.4 Rx

         Single                Multiple
  Sync   T getData()           Iterable<T> getData()
  Async  Future<T> getData()   Observable<T> getData()

  Iterable (pull)     Observable (push)
  T next()            onNext()
  throws Exception    onError(Exception)
  returns;            onCompleted()
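
The pull-to-push duality in the table can be sketched with a toy Observable (this is the shape of Rx, not the real Rx API): an iterable is pulled with next(), while an observable pushes through the three callbacks.

```python
class Observable:
    def __init__(self, source):
        self.source = source  # any iterable

    def subscribe(self, on_next, on_error, on_completed):
        try:
            for item in self.source:
                on_next(item)    # push: counterpart of T next()
        except Exception as exc:
            on_error(exc)        # counterpart of throws Exception
        else:
            on_completed()       # counterpart of a normal return

received = []
Observable([1, 2, 3]).subscribe(
    on_next=received.append,
    on_error=print,
    on_completed=lambda: received.append("done"),
)
print(received)  # [1, 2, 3, 'done']
```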

4.4.1 Map


5 Multithreading

6 Python

6.1 Multiprocessing

Python uses pickling of objects for IPC → a lot of overhead for large amounts of data transmitted between processes.
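
Making the overhead point concrete (sizes are approximate): whatever is sent between processes gets pickled first, so the cost grows with the serialized size of the data.

```python
import pickle

big = list(range(1_000_000))
payload = pickle.dumps(big)      # this byte string crosses the process boundary
print(len(payload) > 1_000_000)  # several megabytes for a million ints: True
```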

6.2 Threading

Potential for race conditions + GIL keeps you from using multiple cores.

6.3 Global Interpreter Lock / GIL

Python threads are real system threads:

  • POSIX threads (pthreads)
  • Windows threads

And are fully managed by the host operating system.

The Global Interpreter Lock, GIL for short, ensures that only one thread runs in the interpreter at once.

With the GIL we get a sort of "cooperative multitasking", where one thread executes, holding the GIL, until it reaches something that doesn't require the CPU, e.g. IO, then releases the GIL, allowing another thread to start execution.

Thus, for tasks where there is a lot of "down time"/IO for a thread, like a webserver, Python isn't too bad!
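
The "down time" point above can be sketched with time.sleep standing in for IO: while a thread sleeps it releases the GIL, so five waits overlap instead of running back to back.

```python
import threading
import time

def fake_io():
    time.sleep(0.2)  # the GIL is released while sleeping

start = time.perf_counter()
threads = [threading.Thread(target=fake_io) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

# Roughly 0.2s total, well under the 1.0s a serial run would need.
print(elapsed < 0.9)  # True
```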

7 Appendix A - Glossary

7.1 General computing glossary

Application Programming Interface / API
set of subroutine definitions, protocols and tools for building software and applications.
Domain Specific Language / DSL
is what you think it is; a language created for a specific domain/task.
Transducer
is a composable algorithmic transformation. It is independent of its input and output sources and specifies only the essence of the transformation in terms of an individual element. Basically, a transducer has the signature (whatever, input -> whatever) -> (whatever, input -> whatever). So it's a mapping of a function to a function with the signatures above, and they're composable!
Message boundary
separation between two messages being sent over a protocol.
Portable Operating System Interface / POSIX
is a family of standards specified by the IEEE Computer Society for maintaining compatibility between OSes. It defines the application programming interface (API), along with command line shells and utility interfaces, for software compatibility with variants of Unix and other OSes.
Virtual Memory
mapping of memory addressses used by a program, called virtual addresses, into physical addresses in computer memory. Main storage as seen by the process or task appears as contiguous address space or collection of contiguous segments.
Monkey patching
way for a program to extend or modify supporting system software locally (affecting only the running instance of the program).
Proxy
object functioning as an interface to something else.
Stack
specialized buffer which stores data from the top down.
Stack pointer
small register that stores the address of the last program request in the stack.
Web Server Gateway Interface / WSGI
is a specification for a simple and universal interface between web servers and web applications or frameworks for Python.

7.2 Concurrency and parallelism glossary

Concurrent computing
form of computing in which several computations are executed during overlapping time periods - concurrently.
Concurrent system
system where computation can advance without waiting for all other computations to complete; where more than one computation can advance at the same time.
Coroutine
general control structure where flow control is cooperatively passed between two different routines without returning. A good example is yield in Python, which creates a coroutine. When yield is encountered, the current state of the function is saved and control is returned to the calling function. The calling function can then transfer control back to the yielding function, and its state will be restored to the point where the yield was encountered, where execution continues. See Appendix B for an example.
Future
See Futures vs Promises.
Interprocess Communication / IPC
mechanism an OS provides to allow processes it manages to share data. Typically, applications can use IPC categorized as clients and servers, where the client requests data and the server responds to client requests.
Mutex
mutual exclusion; a program object that allows multiple threads to share the same resource, such as file access, but not simultaneously. Its purpose is to prevent race conditions.
Parallel computing
computation in which many calculations or the execution of processes are carried out simultaneously.
Promise
See 3.1.
Race condition
behavior where the output is dependent on the sequence or timing of other uncontrollable events.
Time-sharing
sharing of a computing resource among many users at the same time.
Thread locals / thread-local data
global object which is thread-safe and thread-specific.

7.3 Software design

Proxy
object functioning as an interface to something else.

8 Appendix B - Examples

8.1 Python's yield

def test():
    yield 1
    yield 2

# test() returns a generator (an iterator)
t = test()
print(next(t))  # 1
print(next(t))  # 2
# a further next(t) raises StopIteration