Table of Contents
1 Threads vs processes
From a stackoverflow thread:
- Threads run in a shared memory space
- Processes run in separate memory spaces
So threads have shared state, which is both a good and a bad thing: it allows for communication between threads, but also leads to race conditions.
Processes circumvent the issue of race conditions by having separated states, but this also makes communication harder.
1.1 Interprocess Communication / IPC approaches
Method | Short description | Provided by (OS or other environments) |
---|---|---|
File | Record stored on disk, or record synthesized on demand by a file server, which can be accessed by multiple processes. | Most OSes |
Signal | System message sent from one process to another, not usually used to transfer data but instead to remotely command the partnered process. | Most OSes |
Socket | Data stream sent over a network interface, either to a different process on the same computer or to another computer. Typically byte-oriented, sockets rarely preserve message boundaries; data written through sockets requires formatting to preserve them. | Most OSes |
Message queue | Data stream similar to a socket, but which usually preserves message boundaries. Typically implemented by the OS, message queues allow multiple processes to read and write without being directly connected to each other. | Most OSes |
Pipe | Unidirectional data channel. Data written to the write end of the pipe is buffered by the operating system until it is read from the read end. Two-way data streams between processes can be achieved by creating two pipes utilizing STDIN and STDOUT. | All POSIX systems, Windows |
Named pipe | Pipe implemented through a file on the file system instead of STDIN and STDOUT. Multiple processes can read and write to the file as a buffer for the IPC data. | All POSIX systems, Windows, AmigaOS 2.0+ |
Semaphore | Simple structure that synchronizes multiple processes acting on shared resources. | All POSIX systems, Windows, AmigaOS |
Shared memory | Multiple processes are given access to the same block of memory, which creates a shared buffer for the processes to communicate with each other. | All POSIX systems, Windows |
Message passing | Allows multiple programs to communicate using message queues and/or non-OS managed channels; commonly used in concurrency models. | Used in RPC, RMI, and MPI paradigms, Java RMI, CORBA, DDS, MSMQ, MailSlots, QNX, others |
Memory-mapped file | File mapped to RAM that can be modified by changing memory addresses directly instead of writing to a stream. Shares the same benefits as a standard file. | All POSIX systems, Windows |
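A minimal sketch of the pipe row above, using Python's os module: data written to the write end is buffered by the kernel until read from the read end. For brevity both ends live in one process here; normally they would straddle a fork().

```python
import os

# Create a unidirectional pipe: r is the read end, w is the write end.
r, w = os.pipe()
os.write(w, b"hello")    # buffered by the operating system
os.close(w)              # closing the write end signals EOF to the reader
print(os.read(r, 1024))  # b'hello'
os.close(r)
```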
2 Concurrency vs Parallelism
Copied from Appendix A:
- Concurrent computing
- form of computing in which several computations are executed during overlapping time periods - concurrently.
- Parallel computing
- computation in which many calculations or the execution of processes are carried out simultaneously.
Though the terms are very closely related, one can have concurrency without parallelism (multitasking by time-sharing, i.e. sharing of computing resources, on a single-core CPU) and parallelism without concurrency (bit-level parallelism).
3 Concurrency
3.1 Futures vs Promises
Futures and promises describe an object that acts as a proxy for a result that is initially unknown, usually because the computation of its value is not yet complete.
Though "future" is often used interchangeably with "promise", when distinguished a future is a read-only placeholder view of a variable, while a promise is a writable, single-assignment container which sets the value of the future.
- A future may be defined without specifying which promise will set its value
- Different possible promises may set the value of a given future, though this can only be done once for a given future
- A future and a promise can be created together/associated with each other, then:
- The future is the value
- The promise is the function that sets the value
- Essentially, the return value (future) of an asynchronous function (promise)
- Setting the value of a future is called resolving, fulfilling or binding it.
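The split above can be sketched with Python's stdlib, where a single concurrent.futures.Future object plays both roles: the producer resolves it (the promise side) and consumers read it (the future side).

```python
from concurrent.futures import Future

fut = Future()        # read side: placeholder for a value not yet known
fut.set_result(42)    # write side: "resolving" the future (allowed only once)
print(fut.result())   # 42
```

Calling set_result a second time raises an error, which matches the single-assignment property of a promise.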
4 Programming Paradigms
4.1 Functional programming
- No side effects; there's no mutable state.
4.2 Reactive programming
- Changes propagate throughout a system automatically
- Imagine three cells A, B, and C. A is defined as the sum of B and C. Whenever B or C changes, A reacts to update itself.
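The A = B + C example can be sketched as follows; the Cell class and sum_cell helper are made-up names for illustration, not a real reactive library.

```python
class Cell:
    """Holds a value and notifies listeners whenever it changes."""
    def __init__(self, value=0):
        self._value = value
        self._listeners = []

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, v):
        self._value = v
        for listener in self._listeners:
            listener()

def sum_cell(b, c):
    """A is defined as B + C and reacts when either input changes."""
    a = Cell(b.value + c.value)
    def react():
        a.value = b.value + c.value
    b._listeners.append(react)
    c._listeners.append(react)
    return a

b, c = Cell(1), Cell(2)
a = sum_cell(b, c)
print(a.value)   # 3
b.value = 10     # the change propagates automatically
print(a.value)   # 12
```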
4.3 Functional Reactive Programming
- Model user input as a function that changes over time, abstracting away the idea of mutable state.
4.3.1 Signals
A signal sends values over time.
4.4 Rx
 | Single | Multiple |
---|---|---|
Sync | T getData() | Iterable<T> getData() |
Async | Future<T> getData() | Observable<T> getData() |

Iterable | Observable |
---|---|
pull | push |
T next() | onNext() |
throws Exception | onError(Exception) |
returns; | onCompleted() |
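The push column of the table can be sketched like this; the class and function names are made up for illustration, not the Rx API itself. An observer receives onNext/onError/onCompleted callbacks instead of pulling values with next().

```python
class PrintObserver:
    """Receives pushed values instead of pulling them."""
    def on_next(self, value):
        print("next:", value)
    def on_error(self, exc):
        print("error:", exc)
    def on_completed(self):
        print("done")

def observable_of(values, observer):
    # Push each value to the observer, then signal completion;
    # a failure mid-stream is reported via on_error instead.
    try:
        for v in values:
            observer.on_next(v)
    except Exception as exc:
        observer.on_error(exc)
    else:
        observer.on_completed()

observable_of([1, 2, 3], PrintObserver())
```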
4.4.1 Map
5 Multithreading
6 Python
6.1 Multiprocessing
Python uses pickling of objects for IPC → a lot of overhead for large amounts of data transmitted between processes.
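A quick sketch of where that overhead comes from: every object crossing the process boundary is serialized with pickle on one side and deserialized on the other, so the cost grows with the size of the data.

```python
import pickle

payload = {"values": list(range(1000))}
blob = pickle.dumps(payload)    # what multiprocessing sends over its pipe
print(len(blob))                # serialized size grows with the data
restored = pickle.loads(blob)   # deserialized again in the other process
assert restored == payload
```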
6.2 Threading
Potential for race conditions + GIL keeps you from using multiple cores.
6.3 Global Interpreter Lock / GIL
Python threads are real system threads:
- POSIX threads (pthreads)
- Windows threads
And are fully managed by the host operating system.
The Global Interpreter Lock, GIL for short, ensures that only one thread runs in the interpreter at once.
With the GIL we get a sort of "cooperative multitasking", where one thread executes, holding the GIL, until it reaches something that doesn't require the CPU, e.g. IO, then releases the GIL, allowing another thread to start execution.
Thus, for tasks where there is a lot of "down time"/IO for a thread, like a webserver, Python isn't too bad!
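A small sketch of that behavior: four threads each "wait on IO" (simulated here with time.sleep, which releases the GIL), so the waits overlap instead of adding up.

```python
import threading
import time

def fake_io():
    # sleep stands in for a blocking IO call; the GIL is released while waiting
    time.sleep(0.2)

start = time.perf_counter()
threads = [threading.Thread(target=fake_io) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start
print(f"{elapsed:.2f}s")  # roughly 0.2s total, not 4 * 0.2 = 0.8s
```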
7 Appendix A - Glossary
7.1 General computing glossary
- Application Programming Interface / API
- set of subroutine definitions, protocols and tools for building software and applications.
- DSL
- Domain Specific Language, is what you think it is; a language created for a specific domain/task.
- Transducer
- composable algorithmic transformation. It is independent of its input and output sources and specifies only the essence of the transformation in terms of an individual element. Basically, a transducer has the signature (whatever, input -> whatever) -> (whatever, input -> whatever), so it maps a function to a function with that signature, and transducers are composable!
- Message boundary
- separation between two messages being sent over a protocol.
- POSIX
- Portable Operating System Interface is a family of standards specified by the IEEE Computer Society for maintaining compatibility between OSes. It defines the application programming interface (API), along with command line shells and utility interfaces, for software compatibility with variants of Unix and other OSes.
- Virtual Memory
- mapping of memory addresses used by a program, called virtual addresses, into physical addresses in computer memory. Main storage as seen by the process or task appears as a contiguous address space or collection of contiguous segments.
- Monkey patching
- way for a program to extend or modify supporting system software locally (affecting only the running instance of the program).
- Proxy
- object functioning as an interface to something else.
- Stack
- specialized buffer which stores data in last-in, first-out (LIFO) order, growing from the top down.
- Stack pointer
- small register that stores the address of the last program request in the stack.
- WSGI
- Web Server Gateway Interface, is a specification for a simple and universal interface between web servers and web applications or frameworks for Python.
7.2 Concurrency and parallelism glossary
- Concurrent computing
- form of computing in which several computations are executed during overlapping time periods - concurrently.
- Concurrent system
- system where computation can advance without waiting for all other computations to complete; where more than one computation can advance at the same time.
- Coroutines
- general control structure where flow control is cooperatively passed between two different routines without returning. A good example is yield in Python, which creates a coroutine: when yield is encountered, the current state of the function is saved and control is returned to the calling function. The calling function can then transfer control back to the yielding function by calling it again, and its state will be restored to the point where the yield was encountered and execution continues. See Appendix B for an example.
- Future
- See Futures vs Promises.
- Interprocess Communication / IPC
- mechanism an OS provides to allow processes it manages to share data. Typically, applications using IPC are categorized as clients and servers, where the client requests data and the server responds to client requests.
- Mutex
- mutual exclusion, is a program object that allows multiple threads to share the same resource, such as file access, but not simultaneously. Its purpose is to prevent race conditions.
- Parallel computing
- computation in which many calculations or the execution of processes are carried out simultaneously.
- Promise
- See 3.1.
- Race condition
- behavior where the output is dependent on the sequence or timing of other uncontrollable events.
- Time-sharing
- sharing of a computing resource among many users at the same time.
- Thread locals / thread-local data
- global object which is thread-safe and thread-specific.
7.3 Software design
- Proxy
- object functioning as an interface to something else.
8 Appendix B - Examples
8.1 Python's yield
def test():
    yield 1
    yield 2

# test() returns an iterator
t = test()
print(next(t))  # 1
print(next(t))  # 2