NumPy Style API Documentation

Documentation Style

This page demonstrates API documentation using NumPy-style docstrings, which provide a structured format ideal for scientific computing and data analysis projects.

Overview

NumPy-style docstrings use a section-based format with underlined headers. They're particularly well-suited for:

  • Scientific computing libraries
  • Data analysis tools
  • Mathematical software
  • Research code

DataProcessor Class

src.docstring_examples.numpy_style.DataProcessor

DataProcessor(name: str, validation_enabled: bool = True, max_transformations: int = 100)

Comprehensive data processor with loading, transformation, and export.

This class provides a complete data processing pipeline including data loading, transformation operations, validation, and export functionality. It supports various data formats and provides extensive configuration options.

The processor maintains internal state and provides detailed logging of all operations for debugging and monitoring purposes.

Parameters

name : str
    Descriptive name for this processor instance
validation_enabled : bool, optional
    Whether to enable data validation, by default True
max_transformations : int, optional
    Maximum number of transformations allowed, by default 100

Attributes

data : dict or None
    Currently loaded data (None if no data loaded)
transformations_applied : int
    Number of transformations applied to current data
export_count : int
    Number of times data has been exported
validation_enabled : bool
    Whether to validate data during operations
max_transformations : int
    Maximum number of transformations allowed

Examples

Complete workflow example:

>>> # Create processor with validation enabled
>>> processor = DataProcessor("sales_data", validation_enabled=True)
>>>
>>> # Load data from various sources
>>> processor.load_data({"product": "Widget", "sales": 1000})
>>> processor.load_from_file("additional_data.json")
>>>
>>> # Apply transformations
>>> processor.transform_data(
...     lambda x: x * 1.1 if isinstance(x, (int, float)) else x
... )
>>> processor.apply_filter(lambda item: item.get("sales", 0) > 500)
>>>
>>> # Export results
>>> processor.export_data("processed_results.json")

Notes

This processor is thread-safe for read operations but not for concurrent modifications. Use appropriate locking mechanisms if sharing across threads.
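
For shared use across threads, a minimal locking sketch (the lock object and wrapper function here are illustrative, not part of the class):

import threading

processor_lock = threading.Lock()  # hypothetical lock owned by the calling code

def locked_transform(processor, func):
    # Serialize mutating calls; reads stay lock-free per the note above
    with processor_lock:
        return processor.transform_data(func)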

Initialize the data processor.

Parameters

name : str
    Descriptive name for this processor instance
validation_enabled : bool, optional
    Whether to enable data validation, by default True
max_transformations : int, optional
    Maximum number of transformations allowed, by default 100

Raises

ValueError
    If name is empty or max_transformations is negative

Examples

>>> # Basic processor
>>> processor = DataProcessor("basic_processor")
>>>
>>> # Advanced processor with custom settings
>>> processor = DataProcessor(
...     name="advanced_processor",
...     validation_enabled=True,
...     max_transformations=50
... )

Source code in src/docstring_examples/numpy_style.py
def __init__(
    self, name: str, validation_enabled: bool = True, max_transformations: int = 100
) -> None:
    """Initialize the data processor.

    Parameters
    ----------
    name : str
        Descriptive name for this processor instance
    validation_enabled : bool, optional
        Whether to enable data validation, by default True
    max_transformations : int, optional
        Maximum number of transformations allowed, by default 100

    Raises
    ------
    ValueError
        If name is empty or max_transformations is negative

    Examples
    --------
    >>> # Basic processor
    >>> processor = DataProcessor("basic_processor")
    >>>
    >>> # Advanced processor with custom settings
    >>> processor = DataProcessor(
    ...     name="advanced_processor",
    ...     validation_enabled=True,
    ...     max_transformations=50
    ... )
    """
    super().__init__(name)

    if max_transformations < 0:
        raise ValueError("max_transformations must be non-negative")

    self.data: Optional[Dict[str, Any]] = None
    self.transformations_applied = 0
    self.export_count = 0
    self.validation_enabled = validation_enabled
    self.max_transformations = max_transformations

    logger.info(
        f"DataProcessor '{name}' initialized with "
        f"validation={'on' if validation_enabled else 'off'}"
    )

Attributes

status property

status: str

Get the current processor status.

Returns

str
    String indicating current status ("active" or "inactive")
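
A doctest-style check (assuming, as the examples on this page do, that a newly created processor starts out active):

>>> DataProcessor("demo").status
'active'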

Functions

load_data

load_data(data: Union[Dict[str, Any], List[Dict[str, Any]]]) -> None

Load data into the processor.

This method accepts data in dictionary or list format and stores it internally for subsequent processing operations. The data is validated if validation is enabled.

Parameters

data : dict or list of dict
    Data to load - either a single dictionary or list of dictionaries

Raises

ProcessingError
    If data validation fails or processor is inactive
TypeError
    If data is not in expected format

Examples

>>> # Load single record
>>> processor.load_data({"id": 1, "name": "Alice", "score": 95})
>>>
>>> # Load multiple records
>>> processor.load_data([
...     {"id": 1, "name": "Alice", "score": 95},
...     {"id": 2, "name": "Bob", "score": 87}
... ])

Source code in src/docstring_examples/numpy_style.py
def load_data(self, data: Union[Dict[str, Any], List[Dict[str, Any]]]) -> None:
    """Load data into the processor.

    This method accepts data in dictionary or list format and stores it
    internally for subsequent processing operations. The data is validated
    if validation is enabled.

    Parameters
    ----------
    data : dict or list of dict
        Data to load - either a single dictionary or list of dictionaries

    Raises
    ------
    ProcessingError
        If data validation fails or processor is inactive
    TypeError
        If data is not in expected format

    Examples
    --------
    >>> # Load single record
    >>> processor.load_data({"id": 1, "name": "Alice", "score": 95})
    >>>
    >>> # Load multiple records
    >>> processor.load_data([
    ...     {"id": 1, "name": "Alice", "score": 95},
    ...     {"id": 2, "name": "Bob", "score": 87}
    ... ])
    """
    if not self.is_active:
        raise ProcessingError("Cannot load data: processor is inactive")

    if not isinstance(data, (dict, list)):
        raise TypeError("Data must be a dictionary or list of dictionaries")

    if self.validation_enabled:
        self._validate_data(data)

    if isinstance(data, dict):
        self.data = {"records": [data]}
    else:
        self.data = {"records": data}

    logger.info(
        f"Loaded {len(self.data['records'])} record(s) into processor '{self.name}'"
    )
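
As the source shows, both input shapes are normalized into an internal {"records": [...]} structure, so a single dict loads as one record; a quick check via get_statistics:

>>> processor.load_data({"id": 1, "name": "Alice"})
>>> processor.get_statistics()["record_count"]
1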

load_from_file

load_from_file(file_path: Union[str, Path]) -> None

Load data from a JSON file.

Reads data from the specified file path and loads it into the processor. Supports both string paths and Path objects.

Parameters

file_path : str or pathlib.Path
    Path to the JSON file to load

Raises

ProcessingError
    If the file does not exist, cannot be read, or contains invalid JSON;
    the underlying exception (e.g. FileNotFoundError or json.JSONDecodeError)
    is wrapped and re-raised

Examples

>>> # Load from string path
>>> processor.load_from_file("data/input.json")
>>>
>>> # Load from Path object
>>> from pathlib import Path
>>> processor.load_from_file(Path("data") / "input.json")

Source code in src/docstring_examples/numpy_style.py
def load_from_file(self, file_path: Union[str, Path]) -> None:
    """Load data from a JSON file.

    Reads data from the specified file path and loads it into the processor.
    Supports both string paths and Path objects.

    Parameters
    ----------
    file_path : str or pathlib.Path
        Path to the JSON file to load

    Raises
    ------
    ProcessingError
        If the file does not exist, cannot be read, or contains invalid
        JSON; the underlying exception (e.g. FileNotFoundError or
        json.JSONDecodeError) is wrapped and re-raised

    Examples
    --------
    >>> # Load from string path
    >>> processor.load_from_file("data/input.json")
    >>>
    >>> # Load from Path object
    >>> from pathlib import Path
    >>> processor.load_from_file(Path("data") / "input.json")
    """
    if not self.is_active:
        raise ProcessingError("Cannot load from file: processor is inactive")

    file_path = Path(file_path)

    try:
        with file_path.open("r", encoding="utf-8") as f:
            data = json.load(f)
        self.load_data(data)
        logger.info(f"Successfully loaded data from {file_path}")

    except FileNotFoundError:
        raise ProcessingError(f"File not found: {file_path}")
    except json.JSONDecodeError as e:
        raise ProcessingError(f"Invalid JSON in file {file_path}: {e}")
    except Exception as e:
        raise ProcessingError(
            f"Error loading file {file_path}: {e}", original_error=e
        )
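
Because the underlying file and JSON errors are wrapped (see the source above), callers generally only need to catch ProcessingError; a short sketch:

try:
    processor.load_from_file("data/input.json")
except ProcessingError as e:
    # original_error carries the wrapped exception when one exists
    print(f"Load failed: {e}")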

transform_data

transform_data(transformation_func: Callable[[Any], Any]) -> ProcessingResult

Apply a transformation function to all data values.

Applies the provided transformation function to each value in the loaded data. The transformation preserves the data structure while modifying individual values.

Parameters

transformation_func : callable
    Function to apply to each data value. Should accept any value and
    return the transformed value.

Returns

dict
    Dictionary containing transformation results with keys:

- 'records_processed' : int
    Number of records processed
- 'transformations_applied' : int
    Total transformations applied to this dataset
- 'success' : bool
    Whether the transformation completed successfully

Raises

ProcessingError
    If no data is loaded, processor is inactive, or max transformations exceeded
ValueError
    If transformation_func is not callable

Examples

>>> # Convert all strings to uppercase
>>> result = processor.transform_data(
...     lambda x: x.upper() if isinstance(x, str) else x
... )
>>>
>>> # Apply mathematical transformation to numbers
>>> result = processor.transform_data(
...     lambda x: x * 1.1 if isinstance(x, (int, float)) else x
... )
>>>
>>> # Complex transformation with type checking
>>> def complex_transform(value):
...     if isinstance(value, str):
...         return value.strip().title()
...     elif isinstance(value, (int, float)):
...         return round(value * 1.05, 2)
...     return value
>>> result = processor.transform_data(complex_transform)

Source code in src/docstring_examples/numpy_style.py
def transform_data(
    self, transformation_func: Callable[[Any], Any]
) -> ProcessingResult:
    """Apply a transformation function to all data values.

    Applies the provided transformation function to each value in the loaded data.
    The transformation preserves the data structure while modifying
    individual values.

    Parameters
    ----------
    transformation_func : callable
        Function to apply to each data value. Should accept any value and
        return the transformed value.

    Returns
    -------
    dict
        Dictionary containing transformation results with keys:

        - 'records_processed' : int
            Number of records processed
        - 'transformations_applied' : int
            Total transformations applied to this dataset
        - 'success' : bool
            Whether the transformation completed successfully

    Raises
    ------
    ProcessingError
        If no data is loaded, processor is inactive, or max transformations exceeded
    ValueError
        If transformation_func is not callable

    Examples
    --------
    >>> # Convert all strings to uppercase
    >>> result = processor.transform_data(
    ...     lambda x: x.upper() if isinstance(x, str) else x
    ... )
    >>>
    >>> # Apply mathematical transformation to numbers
    >>> result = processor.transform_data(
    ...     lambda x: x * 1.1 if isinstance(x, (int, float)) else x
    ... )
    >>>
    >>> # Complex transformation with type checking
    >>> def complex_transform(value):
    ...     if isinstance(value, str):
    ...         return value.strip().title()
    ...     elif isinstance(value, (int, float)):
    ...         return round(value * 1.05, 2)
    ...     return value
    >>> result = processor.transform_data(complex_transform)
    """
    if not self.is_active:
        raise ProcessingError("Cannot transform data: processor is inactive")

    if self.data is None:
        raise ProcessingError("No data loaded for transformation")

    if not callable(transformation_func):
        raise ValueError("transformation_func must be callable")

    if self.transformations_applied >= self.max_transformations:
        raise ProcessingError(
            f"Maximum transformations ({self.max_transformations}) exceeded"
        )

    try:
        records_processed = 0
        for record in self.data["records"]:
            for key, value in record.items():
                record[key] = transformation_func(value)
            records_processed += 1

        self.transformations_applied += 1

        result = {
            "records_processed": records_processed,
            "transformations_applied": self.transformations_applied,
            "success": True,
        }

        logger.info(
            f"Applied transformation to {records_processed} records in "
            f"processor '{self.name}'"
        )
        return result

    except Exception as e:
        raise ProcessingError(f"Transformation failed: {e}", original_error=e)

apply_filter

apply_filter(filter_func: Callable[[Dict[str, Any]], bool]) -> ProcessingResult

Filter data records based on a predicate function.

Removes records that don't match the filter criteria. The filter function should return True for records to keep and False for records to remove.

Parameters

filter_func : callable
    Predicate function that accepts a record dictionary and returns
    True to keep the record, False to remove it

Returns

dict
    Dictionary containing filter results with keys:

- 'records_before' : int
    Number of records before filtering
- 'records_after' : int
    Number of records after filtering
- 'records_removed' : int
    Number of records removed
- 'success' : bool
    Whether the filter operation completed successfully

Raises

ProcessingError
    If no data is loaded or processor is inactive
ValueError
    If filter_func is not callable

Examples

>>> # Keep only records with score > 80
>>> result = processor.apply_filter(lambda record: record.get('score', 0) > 80)
>>>
>>> # Keep records with specific status
>>> result = processor.apply_filter(
...     lambda record: record.get('status') == 'active'
... )
>>>
>>> # Complex filter with multiple conditions
>>> def complex_filter(record):
...     return (record.get('score', 0) > 70 and
...             record.get('active', False) and
...             len(record.get('name', '')) > 0)
>>> result = processor.apply_filter(complex_filter)

Source code in src/docstring_examples/numpy_style.py
def apply_filter(
    self, filter_func: Callable[[Dict[str, Any]], bool]
) -> ProcessingResult:
    """Filter data records based on a predicate function.

    Removes records that don't match the filter criteria. The filter function
    should return True for records to keep and False for records to remove.

    Parameters
    ----------
    filter_func : callable
        Predicate function that accepts a record dictionary and returns
        True to keep the record, False to remove it

    Returns
    -------
    dict
        Dictionary containing filter results with keys:

        - 'records_before' : int
            Number of records before filtering
        - 'records_after' : int
            Number of records after filtering
        - 'records_removed' : int
            Number of records removed
        - 'success' : bool
            Whether the filter operation completed successfully

    Raises
    ------
    ProcessingError
        If no data is loaded or processor is inactive
    ValueError
        If filter_func is not callable

    Examples
    --------
    >>> # Keep only records with score > 80
    >>> result = processor.apply_filter(lambda record: record.get('score', 0) > 80)
    >>>
    >>> # Keep records with specific status
    >>> result = processor.apply_filter(
    ...     lambda record: record.get('status') == 'active'
    ... )
    >>>
    >>> # Complex filter with multiple conditions
    >>> def complex_filter(record):
    ...     return (record.get('score', 0) > 70 and
    ...            record.get('active', False) and
    ...            len(record.get('name', '')) > 0)
    >>> result = processor.apply_filter(complex_filter)
    """
    if not self.is_active:
        raise ProcessingError("Cannot apply filter: processor is inactive")

    if self.data is None:
        raise ProcessingError("No data loaded for filtering")

    if not callable(filter_func):
        raise ValueError("filter_func must be callable")

    try:
        records_before = len(self.data["records"])

        filtered_records = []
        for record in self.data["records"]:
            if filter_func(record):
                filtered_records.append(record)

        self.data["records"] = filtered_records
        records_after = len(filtered_records)
        records_removed = records_before - records_after

        result = {
            "records_before": records_before,
            "records_after": records_after,
            "records_removed": records_removed,
            "success": True,
        }

        logger.info(
            f"Filter applied: {records_removed} records removed, "
            f"{records_after} remaining"
        )
        return result

    except Exception as e:
        raise ProcessingError(f"Filter operation failed: {e}", original_error=e)

export_data

export_data(file_path: Union[str, Path], format: str = 'json') -> None

Export processed data to a file.

Saves the current processed data to the specified file path in the requested format. Currently supports JSON format with plans for additional formats in future versions.

Parameters

file_path : str or pathlib.Path
    Output file path for the exported data
format : str, optional
    Export format ("json" currently supported), by default "json"

Raises

ProcessingError
    If there is no data to export, the processor is inactive, or the write
    fails (filesystem errors such as PermissionError are wrapped and
    re-raised)
ValueError
    If format is not supported

Examples

>>> # Basic JSON export
>>> processor.export_data("output.json")
>>>
>>> # Export with explicit format
>>> processor.export_data("output.json", format="json")
>>>
>>> # Export to Path object
>>> from pathlib import Path
>>> output_path = Path("exports") / "processed_data.json"
>>> processor.export_data(output_path)

Source code in src/docstring_examples/numpy_style.py
def export_data(self, file_path: Union[str, Path], format: str = "json") -> None:
    """Export processed data to a file.

    Saves the current processed data to the specified file path in the
    requested format. Currently supports JSON format with plans for
    additional formats in future versions.

    Parameters
    ----------
    file_path : str or pathlib.Path
        Output file path for the exported data
    format : str, optional
        Export format ("json" currently supported), by default "json"

    Raises
    ------
    ProcessingError
        If there is no data to export, the processor is inactive, or the
        write fails (filesystem errors such as PermissionError are
        wrapped and re-raised)
    ValueError
        If format is not supported

    Examples
    --------
    >>> # Basic JSON export
    >>> processor.export_data("output.json")
    >>>
    >>> # Export with explicit format
    >>> processor.export_data("output.json", format="json")
    >>>
    >>> # Export to Path object
    >>> from pathlib import Path
    >>> output_path = Path("exports") / "processed_data.json"
    >>> processor.export_data(output_path)
    """
    if not self.is_active:
        raise ProcessingError("Cannot export data: processor is inactive")

    if self.data is None:
        raise ProcessingError("No data to export")

    if format.lower() != "json":
        raise ValueError(f"Unsupported export format: {format}")

    file_path = Path(file_path)

    try:
        # Ensure parent directory exists
        file_path.parent.mkdir(parents=True, exist_ok=True)

        with file_path.open("w", encoding="utf-8") as f:
            json.dump(self.data, f, indent=2, ensure_ascii=False)

        self.export_count += 1
        logger.info(f"Exported data to {file_path} (export #{self.export_count})")

    except PermissionError:
        raise ProcessingError(f"Permission denied writing to {file_path}")
    except Exception as e:
        raise ProcessingError(f"Export failed: {e}", original_error=e)

get_statistics

get_statistics() -> Dict[str, Any]

Get comprehensive statistics about the processor and its data.

Returns detailed information about the current state of the processor, including data counts, transformation history, and processing metrics.

Returns

dict
    Dictionary containing statistics with keys:

- 'processor_name' : str
    Name of this processor instance
- 'processor_status' : str
    Current status (active/inactive)
- 'data_loaded' : bool
    Whether data is currently loaded
- 'record_count' : int
    Number of records currently loaded
- 'transformations_applied' : int
    Number of transformations applied
- 'export_count' : int
    Number of times data has been exported
- 'validation_enabled' : bool
    Whether validation is enabled
- 'created_at' : str
    When the processor was created (ISO format)
- 'uptime_seconds' : float
    How long the processor has existed in seconds

Examples

>>> stats = processor.get_statistics()
>>> print(f"Processor: {stats['processor_name']}")
>>> print(f"Records: {stats['record_count']}")
>>> print(f"Transformations: {stats['transformations_applied']}")

Source code in src/docstring_examples/numpy_style.py
def get_statistics(self) -> Dict[str, Any]:
    """Get comprehensive statistics about the processor and its data.

    Returns detailed information about the current state of the processor,
    including data counts, transformation history, and processing metrics.

    Returns
    -------
    dict
        Dictionary containing statistics with keys:

        - 'processor_name' : str
            Name of this processor instance
        - 'processor_status' : str
            Current status (active/inactive)
        - 'data_loaded' : bool
            Whether data is currently loaded
        - 'record_count' : int
            Number of records currently loaded
        - 'transformations_applied' : int
            Number of transformations applied
        - 'export_count' : int
            Number of times data has been exported
        - 'validation_enabled' : bool
            Whether validation is enabled
        - 'created_at' : str
            When the processor was created (ISO format)
        - 'uptime_seconds' : float
            How long the processor has existed in seconds

    Examples
    --------
    >>> stats = processor.get_statistics()
    >>> print(f"Processor: {stats['processor_name']}")
    >>> print(f"Records: {stats['record_count']}")
    >>> print(f"Transformations: {stats['transformations_applied']}")
    """
    current_time = datetime.now()
    uptime = (current_time - self.created_at).total_seconds()

    stats = {
        "processor_name": self.name,
        "processor_status": self.status,
        "data_loaded": self.data is not None,
        "record_count": len(self.data["records"]) if self.data else 0,
        "transformations_applied": self.transformations_applied,
        "export_count": self.export_count,
        "validation_enabled": self.validation_enabled,
        "created_at": self.created_at.isoformat(),
        "uptime_seconds": round(uptime, 2),
    }

    return stats

process

process(data: Any) -> Any

Process data using the internal pipeline.

Implementation of the abstract process method from BaseProcessor. This method provides a simplified interface for basic data processing.

Parameters

data : Any
    Data to process

Returns

Any
    Processed data

Raises

ProcessingError
    If processing fails

Source code in src/docstring_examples/numpy_style.py
def process(self, data: Any) -> Any:
    """Process data using the internal pipeline.

    Implementation of the abstract process method from BaseProcessor.
    This method provides a simplified interface for basic data processing.

    Parameters
    ----------
    data : Any
        Data to process

    Returns
    -------
    Any
        Processed data

    Raises
    ------
    ProcessingError
        If processing fails
    """
    try:
        self.load_data(data)
        return self.data
    except Exception as e:
        raise ProcessingError(f"Processing failed: {e}", original_error=e)

deactivate

deactivate() -> None

Deactivate the processor.

Once deactivated, the processor should not perform any operations until reactivated.

Notes

This method logs the deactivation event for monitoring purposes.

Source code in src/docstring_examples/numpy_style.py
def deactivate(self) -> None:
    """Deactivate the processor.

    Once deactivated, the processor should not perform any operations
    until reactivated.

    Notes
    -----
    This method logs the deactivation event for monitoring purposes.
    """
    self.is_active = False
    logger.info(f"Processor '{self.name}' deactivated")

ProcessingError Exception

src.docstring_examples.numpy_style.ProcessingError

ProcessingError(message: str, error_code: Optional[str] = None, original_error: Optional[Exception] = None)

Bases: Exception

Custom exception for data processing errors.

This exception is raised when data processing operations fail due to invalid data, configuration errors, or runtime issues.

Parameters

message : str
    Error message describing the failure
error_code : str, optional
    Optional error code for categorization
original_error : Exception, optional
    Original exception that caused this error

Attributes

message : str
    Error message describing the failure
error_code : str or None
    Optional error code for categorization
original_error : Exception or None
    Original exception that caused this error

Initialize ProcessingError.

Parameters

message : str
    Descriptive error message
error_code : str, optional
    Optional categorization code
original_error : Exception, optional
    The original exception if this is a wrapper
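
A construction sketch using the documented signature and attributes (the error code value is illustrative):

>>> try:
...     int("not a number")
... except ValueError as e:
...     err = ProcessingError("Conversion failed", error_code="E_PARSE", original_error=e)
>>> err.error_code
'E_PARSE'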

Module-Level Functions

Functions Coming Soon

Module-level function documentation will be added when the source code is available.

Example Usage

from docstring_examples.numpy_style import DataProcessor

# Create a processor instance
processor = DataProcessor(
    name="experiment_analysis",
    validation_enabled=True,
    max_transformations=15
)

# Illustrative experiment data and helpers
experiment_data = [
    {"sample": "A", "temperature": 21.53, "yield": 0.824},
    {"sample": "B", "temperature": 19.01, "yield": 0.641},
]

def normalize_measurements(value):
    # Round numeric measurements to two decimals; pass other values through
    return round(value, 2) if isinstance(value, float) else value

def optimal_conditions_filter(record):
    # Keep only high-yield samples
    return record.get("yield", 0) > 0.7

# Load and process scientific data
processor.load_data(experiment_data)
processor.transform_data(normalize_measurements)
processor.apply_filter(optimal_conditions_filter)

# Export results with metadata
processor.export_data('results.json')

Style Benefits

Structure

  • Clear section headers
  • Organized parameter lists
  • Dedicated sections for each aspect

Scientific Focus

  • Ideal for complex mathematical descriptions
  • Supports detailed parameter documentation
  • Great for algorithm documentation

Best Practices

  • Use section headers consistently
  • Document all parameters thoroughly
  • Include mathematical notation where relevant
  • Provide comprehensive examples (see the skeleton below)
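
A minimal NumPy-style skeleton pulling these practices together (the function itself is illustrative):

def scale(values, factor=1.0):
    """Scale each value by a constant factor.

    Parameters
    ----------
    values : list of float
        Values to scale
    factor : float, optional
        Multiplicative factor, by default 1.0

    Returns
    -------
    list of float
        Scaled values

    Examples
    --------
    >>> scale([1.0, 2.0], factor=2.0)
    [2.0, 4.0]
    """
    return [v * factor for v in values]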