Helpers
Helper modules.
These should be standalone modules that could reasonably be their own PyPI packages. This comes with two benefits:
Each module is free of business logic, which makes it easier to understand.
Each module is decoupled, which makes it easy to reuse in different parts of the code base. An example is the
stack_exchange_graph_data.helpers.progress
module, which is used in both
stack_exchange_graph_data.helpers.curl.curl()
and
stack_exchange_graph_data.driver.load_xml_stream()
. Because it wraps a stream, it transfers easily to any Python loop, and because it contains no business logic, no monkey patching is needed.
Cache
Simple file cache.
Exposes two forms of cache:
A file that is downloaded from a website.
A 7z archive cache - files that are extracted from a 7z archive.
-
class
stack_exchange_graph_data.helpers.cache.
Archive7zCache
(cache_path: pathlib.Path, archive_cache: stack_exchange_graph_data.helpers.cache.CacheMethod)¶ Exposes a cache that allows unzipping 7z archives.
-
ensure
(use_cache: bool = True) → pathlib.Path¶ Ensure target file exists.
Unzips the 7z archive showing the name and size of each file being extracted.
- Parameters
use_cache – Set to False to force the archive to be extracted again.
- Returns
Location of file.
-
-
class
stack_exchange_graph_data.helpers.cache.
Cache
(cache_dir: pathlib.Path)¶ Interface to make cache instances.
-
archive_7z
(cache_path: pathlib.Path, archive_cache: stack_exchange_graph_data.helpers.cache.CacheMethod) → stack_exchange_graph_data.helpers.cache.Archive7zCache¶ Get an archive cache endpoint.
- Parameters
cache_path – Location of file relative to the cache directory.
archive_cache – A cache endpoint to get the 7z archive from.
- Returns
An archive cache endpoint.
-
file
(cache_path: str, url: str) → stack_exchange_graph_data.helpers.cache.FileCache¶ Get a file cache endpoint.
- Parameters
cache_path – Location of file relative to the cache directory.
url – URL location of the file to download from if not cached.
- Returns
A file cache endpoint.
-
-
class
stack_exchange_graph_data.helpers.cache.
CacheMethod
(cache_path: pathlib.Path)¶ Base cache object.
-
_is_cached
(use_cache: bool) → bool¶ Check if the target exists in the cache.
- Parameters
use_cache – Set to False to force a redownload of the data.
- Returns
True if we should use the cache.
-
ensure
(use_cache: bool = True) → pathlib.Path¶ Ensure target file exists.
This should be overridden in child classes.
- Parameters
use_cache – Set to False to force a redownload of the data.
- Returns
Location of file.
-
-
class
stack_exchange_graph_data.helpers.cache.
FileCache
(cache_path: pathlib.Path, url: str)¶ Exposes a cache that allows downloading files.
-
ensure
(use_cache: bool = True) → pathlib.Path¶ Ensure target file exists.
This curls the file from the web to cache, providing a progress bar whilst downloading.
- Parameters
use_cache – Set to False to force a redownload of the data.
- Returns
Location of file.
-
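The `ensure` contract shared by the cache classes above can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's implementation: `SketchFileCache`, its `fetch` callable, and `fake_fetch` are hypothetical names, and `fetch` stands in for the real download step so the example runs without network access.

```python
import pathlib
import tempfile
from typing import Callable, List


class SketchFileCache:
    """Hypothetical stand-in for a file cache endpoint."""

    def __init__(self, cache_path: pathlib.Path, fetch: Callable[[], bytes]) -> None:
        self.cache_path = cache_path
        self._fetch = fetch  # assumption: replaces the real download

    def _is_cached(self, use_cache: bool) -> bool:
        # Use the cached copy only when allowed and the file exists.
        return use_cache and self.cache_path.exists()

    def ensure(self, use_cache: bool = True) -> pathlib.Path:
        if not self._is_cached(use_cache):
            self.cache_path.parent.mkdir(parents=True, exist_ok=True)
            self.cache_path.write_bytes(self._fetch())
        return self.cache_path


calls: List[int] = []


def fake_fetch() -> bytes:
    # Record each call so cache hits and misses can be counted.
    calls.append(1)
    return b"data"


with tempfile.TemporaryDirectory() as tmp:
    cache = SketchFileCache(pathlib.Path(tmp) / "sub" / "file.bin", fake_fetch)
    first = cache.ensure()  # miss: fetches and writes the file
    second = cache.ensure()  # hit: the file already exists
    third = cache.ensure(use_cache=False)  # forced refetch
    content = third.read_bytes()
    fetch_count = len(calls)
```

In this sketch the second call reuses the file on disk, while `use_cache=False` bypasses the existence check and fetches again.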
Coroutines
Coroutine helpers.
A lot of this module is based on the assumption that Python doesn’t seamlessly handle the destruction of coroutines when using multiplexing or broadcasting. It also helps ease interactions when coroutines enter closed states prematurely.
-
class
stack_exchange_graph_data.helpers.coroutines.
CoroutineDelegator
¶ Helper class for delegating to coroutines.
-
_increment_coroutine_refs
() → None¶ Increment the number of sources for the coroutines.
-
run
() → List[Iterator]¶ Send all data into the coroutine control flow.
- Returns
If a coroutine is closed prematurely, the data that has not entered the control flow is returned. Otherwise an empty list is returned.
-
send_to
(source: Union[Iterator, Iterable], target: Generator) → None¶ Add a source and target to send data to.
This does not send any data into the target. To do that, call
CoroutineDelegator.run()
.
- Parameters
source – Input data, can be any iterable. Each item is passed unaltered to target.
target – The coroutine through which the data enters the coroutine control flow.
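The split between registering a source and actually sending its data can be sketched as below. This is a simplified illustration, not the library's code: `SketchDelegator` and `take` are hypothetical names, the target is primed by hand, and none of the magic-coroutine bookkeeping is modelled.

```python
from typing import Generator, Iterable, Iterator, List, Tuple


class SketchDelegator:
    """Hypothetical, simplified delegator."""

    def __init__(self) -> None:
        self._queue: List[Tuple[Iterator, Generator]] = []

    def send_to(self, source: Iterable, target: Generator) -> None:
        # Only record the pair; nothing is sent yet.
        self._queue.append((iter(source), target))

    def run(self) -> List[Iterator]:
        unsent: List[Iterator] = []
        for source, target in self._queue:
            try:
                for item in source:
                    target.send(item)
            except StopIteration:
                # The target finished early; hand back what is left.
                unsent.append(source)
        return unsent


def take(limit: int, into: List[int]) -> Generator[None, int, None]:
    # Accept `limit` items, then finish.
    for _ in range(limit):
        item = yield
        into.append(item)


got: List[int] = []
target = take(2, got)
next(target)  # prime by hand in this sketch

delegator = SketchDelegator()
delegator.send_to([1, 2, 3, 4], target)
leftover = delegator.run()
remaining = [list(source) for source in leftover]
```

Here the target stops after two items, so the iterator handed back by `run` still holds the items that never entered the pipeline.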
-
-
stack_exchange_graph_data.helpers.coroutines.
_is_magic_coroutine
(target: Any) → bool¶ Check if target is a magic coroutine.
- Parameters
target – An object to check against.
- Returns
True if the object is a magic coroutine.
-
stack_exchange_graph_data.helpers.coroutines.
broadcast
(*targets: Generator) → Generator¶ Broadcast items to targets.
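Broadcasting with plain generator coroutines can be sketched as follows. This is an illustration of the general technique only: `sketch_broadcast` and `sketch_sink` are hypothetical names, every coroutine is primed by hand, and closing the targets in a `finally` block is an assumption about what a sensible broadcaster would do.

```python
from typing import Generator, List


def sketch_sink(into: List[str]) -> Generator[None, str, None]:
    # Store every item that is sent in.
    while True:
        item = yield
        into.append(item)


def sketch_broadcast(*targets: Generator) -> Generator[None, str, None]:
    try:
        while True:
            item = yield
            # Forward the same item to every target.
            for target in targets:
                target.send(item)
    finally:
        # Assumption: closing the broadcaster closes its targets too.
        for target in targets:
            target.close()


left: List[str] = []
right: List[str] = []
targets = [sketch_sink(left), sketch_sink(right)]
for target in targets:
    next(target)  # prime each sink

broadcaster = sketch_broadcast(*targets)
next(broadcaster)  # prime the broadcaster
for item in ("a", "b"):
    broadcaster.send(item)
broadcaster.close()
```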
-
stack_exchange_graph_data.helpers.coroutines.
coroutine
(function: Callable) → Callable¶ Wrap a coroutine generating function to make magic coroutines.
A magic coroutine is wrapped in a protective coroutine that eases the destruction of coroutine pipelines. This is because the coroutine is wrapped in a ‘bubble’ that:
Primes the coroutine when the first element of data is passed to it.
Sends information about the creation and destruction of other coroutines in the pipeline. This allows a coroutine to destroy itself when all providers have exited.
Handles a coroutine being closed prematurely. In that case all target coroutines are notified that some data sources are no longer available, allowing them to deallocate themselves if needed.
Handles a target coroutine being closed prematurely. In that case the current coroutine is closed and exits with a StopIteration error, as if it had been closed with
.close
.
These coroutine pipelines should be started via
stack_exchange_graph_data.helpers.coroutines.CoroutineDelegator
, which correctly initializes the entry coroutine and handles the case where a coroutine has been closed prematurely.
- Parameters
function – Standard coroutine generator function.
- Returns
Function that generates magic coroutines.
-
stack_exchange_graph_data.helpers.coroutines.
file_sink
(*args: Any, **kwargs: Any) → Generator¶ Send all data to a file.
-
stack_exchange_graph_data.helpers.coroutines.
primed_coroutine
(function: Callable[[...], Generator]) → Callable¶ Primes a coroutine at creation.
- Parameters
function – A coroutine function.
- Returns
The coroutine function wrapped to prime the coroutine at creation.
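Priming at creation is a common generator idiom and can be sketched as below. The names `primed` and `collect` are hypothetical; this shows the general technique rather than the module's own code.

```python
import functools
from typing import Any, Callable, Generator, List


def primed(function: Callable[..., Generator]) -> Callable[..., Generator]:
    """Hypothetical decorator: advance the generator to its first yield."""

    @functools.wraps(function)
    def inner(*args: Any, **kwargs: Any) -> Generator:
        generator = function(*args, **kwargs)
        next(generator)  # run up to the first `yield` so send() works
        return generator

    return inner


@primed
def collect(into: List[int]) -> Generator[None, int, None]:
    while True:
        item = yield
        into.append(item)


received: List[int] = []
sink = collect(received)
sink.send(1)  # no TypeError: the generator is already primed
sink.send(2)
sink.close()
```

Without the decorator, the first `send` of a non-`None` value to a freshly created generator raises `TypeError`, which is why priming is needed.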
Curl
Copy URL.
-
stack_exchange_graph_data.helpers.curl.
curl
(path: pathlib.Path, *args: Any, **kwargs: Any) → None¶ Download file to system.
Provides a progress bar of the file being downloaded and some statistics around the file and download.
- Parameters
path – Local path to save the file to.
args, kwargs – Passed to
requests.get
.
Progress
Display progress of a stream.
-
class
stack_exchange_graph_data.helpers.progress.
BaseProgressStream
(stream: Iterator[T], size: Optional[int], si: Callable[[int], Tuple[int, str]], progress: Callable[[T], int], width: int = 20, prefix: str = '', start: int = 0, message: Optional[str] = None)¶ Display the progress of a stream.
-
_get_progress
(current: int) → str¶ Get the progress of the stream.
- Parameters
current – Current progress as an absolute amount, not a percentage.
- Returns
Progress bar and file size.
-
-
class
stack_exchange_graph_data.helpers.progress.
DataProgressStream
(stream: Iterator[T], size: Optional[int], width: int = 20, prefix: str = '', message: Optional[str] = None)¶ Display progress of a data stream.
-
class
stack_exchange_graph_data.helpers.progress.
ItemProgressStream
(stream: Iterator[T], size: Optional[int], width: int = 20, prefix: str = '', message: Optional[str] = None)¶ Display progress of an item stream.
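Wrapping a stream so that it reports progress while passing items through can be sketched as below. This is a simplified illustration: `sketch_bar` and `sketch_item_progress` are hypothetical names, the bar format is invented for the example, and the `out` parameter is an assumption added so the output can be captured.

```python
import io
import sys
from typing import Iterable, Iterator, Optional, TextIO, TypeVar

T = TypeVar("T")


def sketch_bar(current: int, size: int, width: int = 20) -> str:
    """Render a fixed-width bar; the format is an assumption."""
    filled = min(width, width * current // size) if size else 0
    return "[" + "#" * filled + " " * (width - filled) + "]"


def sketch_item_progress(
    stream: Iterable[T],
    size: Optional[int],
    width: int = 20,
    prefix: str = "",
    out: TextIO = sys.stderr,
) -> Iterator[T]:
    """Yield every item unaltered while reporting how many have passed."""
    current = 0
    for item in stream:
        current += 1
        if size:
            bar = sketch_bar(current, size, width)
            out.write(f"\r{prefix}{bar} {current}/{size}")
        else:
            # Without a known size only a running count can be shown.
            out.write(f"\r{prefix}{current}")
        yield item
    out.write("\n")


buffer = io.StringIO()
doubled = [x * 2 for x in sketch_item_progress(range(4), 4, width=8, out=buffer)]
```

Because the wrapper is just an iterator, it drops into any existing `for` loop or comprehension without changing the loop body.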
SI
Simplify a number to a wanted base.
-
class
stack_exchange_graph_data.helpers.si.
Magnitude
¶ Magnitude conversions.
-
byte
() → Tuple[int, str]¶ Convert a number to a truncated base form.
- Parameters
value – Value to adjust.
- Returns
Truncated value and unit.
-
ibyte
() → Tuple[int, str]¶ Convert a number to a truncated base form.
- Parameters
value – Value to adjust.
- Returns
Truncated value and unit.
-
number
() → Tuple[int, str]¶ Convert a number to a truncated base form.
- Parameters
value – Value to adjust.
- Returns
Truncated value and unit.
-
-
stack_exchange_graph_data.helpers.si.
display
(values: Tuple[int, str], decimal_places: int = 2) → str¶ Display a truncated number to the requested number of decimal places.
- Parameters
values – Value and unit to display.
decimal_places – Number of decimal places to display the value to.
- Returns
Right aligned display value.
-
stack_exchange_graph_data.helpers.si.
si_magnitude
(base: int, suffix: str, prefixes: str) → Callable[[int], Tuple[int, str]]¶ SI base converter builder.
- Parameters
base – Base to truncate values to.
suffix – Suffix used to denote the type of information.
prefixes – Prefixes before the suffix to denote magnitude.
- Returns
A function to change a value by the above parameters.
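A converter builder of this shape can be sketched as follows. This is an illustration under stated assumptions, not the module's implementation: `sketch_si_magnitude` and `sketch_display` are hypothetical names, the converter returns a float rather than the int in the documented signature so that decimal places are meaningful, and the display width is invented for the example.

```python
from typing import Callable, Tuple


def sketch_si_magnitude(
    base: int, suffix: str, prefixes: str
) -> Callable[[int], Tuple[float, str]]:
    """Build a converter that scales a value down by `base` per prefix."""

    def convert(value: int) -> Tuple[float, str]:
        scaled = float(value)
        prefix = ""
        for candidate in prefixes:
            if scaled < base:
                break
            scaled /= base
            prefix = candidate
        return scaled, prefix + suffix

    return convert


def sketch_display(values: Tuple[float, str], decimal_places: int = 2) -> str:
    value, unit = values
    # Assumption: right-align the number in a small fixed-width field.
    return f"{value:>{4 + decimal_places}.{decimal_places}f} {unit}"


ibyte = sketch_si_magnitude(1024, "iB", "KMGT")
byte = sketch_si_magnitude(1000, "B", "kMGT")

kib = ibyte(1536)
small = byte(999)
large = byte(2_500_000)
shown = sketch_display(kib)
```

Building the converter from `base`, `suffix`, and `prefixes` lets the same loop serve decimal bytes, binary bytes, and plain counts by changing only the arguments.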
XRef
Expand partial xrefs.
-
stack_exchange_graph_data.helpers.xref.
custom_parser
(prefix: str) → Type[docutils.parsers.Parser]¶ Markdown parser with partial xref support.
Extends
recommonmark.parser.CommonMarkParser
to include the
custom_parser.PendingXRefTransform
transform.
- Parameters
prefix – HTTP base to prepend to partial hyperlinks.
- Returns
A custom parser to parse Markdown.