EDSL Discussion: Adding NumPy 2.0 Support for Enhanced Compatibility

by Viktoria Ivanova

Hey everyone!

We need to talk about something super important for the future of EDSL: NumPy 2.0 support. Right now, EDSL is capped at NumPy versions less than 2.0, and this is starting to cause some major headaches. Let's dive into why this is a problem, what we can do about it, and how it impacts you.

Description

Currently, EDSL restricts NumPy to versions less than 2.0. This limitation is becoming a real issue, especially as Python 3.13 and other modern environments increasingly rely on NumPy 2.0 and above. To keep EDSL relevant and compatible, we need to address this.

Current Constraint

If you peek into EDSL's pyproject.toml file, specifically line 21, you'll see this:

numpy = "^1.22"

This line uses Poetry's caret constraint, which translates to "greater than or equal to 1.22, but less than 2.0": the caret allows any update that leaves the leftmost non-zero version component unchanged.
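To make the caret rule concrete, here is a minimal sketch of how such a constraint expands into an explicit range. The helper name `caret_to_range` is made up for illustration and is not part of Poetry; it handles only plain numeric versions.

```python
def caret_to_range(constraint: str) -> str:
    """Expand a caret constraint such as "^1.22" into an explicit range.

    Hypothetical helper for illustration only; Poetry's own parser
    handles many more cases (pre-releases, wildcards, and so on).
    """
    if not constraint.startswith("^"):
        raise ValueError(f"not a caret constraint: {constraint!r}")
    lower = constraint[1:].strip()
    parts = [int(p) for p in lower.split(".")]

    # Bump the leftmost non-zero component; if every component is zero,
    # bump the last one that was written.
    bump_at = next((i for i, p in enumerate(parts) if p != 0), len(parts) - 1)
    upper = parts[:bump_at] + [parts[bump_at] + 1]

    # Write a bare major version with a trailing zero, e.g. [2] -> "2.0".
    if len(upper) == 1:
        upper.append(0)
    return f">={lower},<{'.'.join(map(str, upper))}"


print(caret_to_range("^1.22"))  # >=1.22,<2.0
```

The same rule explains why a caret on a 0.x version is much tighter: `caret_to_range("^0.2.3")` gives `>=0.2.3,<0.3`.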

Checking the published PyPI package confirms this constraint:

$ curl -s https://pypi.org/pypi/edsl/json | jq '.info.requires_dist[] | select(contains("numpy"))'
"numpy<2.0,>=1.22"

Why is this a problem? Let's break it down.

This constraint means that EDSL can only use NumPy versions starting from 1.22 up to (but not including) 2.0. While this might seem like a minor detail, it has significant implications for compatibility and usability. Let’s explore the impacts further.

Impact

The restriction on NumPy versions has several critical impacts. These issues affect not only the usability of EDSL but also its compatibility with the broader Python ecosystem. Let's look at the main issues:

  1. Python 3.13 Compatibility: The last 1.x release of NumPy, version 1.26.4, offers limited support for Python 3.13. It ships no prebuilt wheels for that interpreter, so pip falls back to building NumPy from source. This can lead to build failures, which are often indicated by errors like the following:

    subprocess.CalledProcessError: Command '['make']' returned non-zero exit status 2.
    

    When you encounter such errors, it means that the older version of NumPy is struggling with the newer Python environment, causing the build process to fail. This is a significant barrier for users who want to leverage the latest Python features while using EDSL.

  2. Ecosystem Compatibility: Many packages within the scientific Python ecosystem are actively transitioning to NumPy 2.0 and later. This transition creates dependency conflicts for projects aiming to integrate EDSL alongside these modern packages. For example, if a project requires the latest version of a scientific library that depends on NumPy 2.0 features, it cannot simultaneously use EDSL due to the version constraint. This forces developers to make difficult choices between using EDSL and benefiting from the latest advancements in other scientific Python libraries.

    The scientific Python ecosystem is vast, and many libraries are interlinked. When a core library like NumPy undergoes a major version update, it sets off a chain reaction. Libraries that depend on NumPy must adapt to the new features and APIs. By lagging in NumPy support, EDSL risks becoming isolated from the rest of the ecosystem.

    Furthermore, as more libraries adopt NumPy 2.0, the pressure on EDSL users to upgrade will only increase. Developers don't want to be stuck using outdated versions of libraries, as this can mean missing out on performance improvements, bug fixes, and new features.

    Imagine a scenario where a researcher wants to use EDSL along with a cutting-edge machine learning library. If that library requires NumPy 2.0, the researcher is forced to choose between using EDSL and staying current with their machine learning tools. This kind of dilemma is not only frustrating but also hinders innovation.

    To avoid these issues, it’s crucial for EDSL to maintain compatibility with the latest versions of NumPy. This will ensure that EDSL can seamlessly integrate with other tools in the Python scientific stack, enhancing its usability and appeal.
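The dependency conflict described above can be sketched in a few lines. This is a simplified illustration, not how pip or Poetry actually resolve dependencies: `satisfies` is a hypothetical helper that understands only plain numeric versions, and `other_spec` stands in for an imaginary library that requires NumPy 2.0 or later.

```python
import operator

OPS = {">=": operator.ge, "<=": operator.le, "==": operator.eq,
       ">": operator.gt, "<": operator.lt}


def parse_version(text: str) -> tuple[int, ...]:
    """Turn "1.26.4" into (1, 26, 4), padding short versions with zeros."""
    parts = [int(part) for part in text.strip().split(".")]
    return tuple(parts + [0] * (3 - len(parts)))


def satisfies(version: str, spec: str) -> bool:
    """Check a version against a comma-separated spec like ">=1.22,<2.0"."""
    for clause in spec.split(","):
        clause = clause.strip()
        # Try two-character operators first so ">=" is not read as ">".
        op = next(o for o in (">=", "<=", "==", ">", "<") if clause.startswith(o))
        if not OPS[op](parse_version(version), parse_version(clause[len(op):])):
            return False
    return True


candidates = ["1.22.0", "1.26.4", "2.0.0", "2.1.0"]  # a few real NumPy releases
edsl_spec = ">=1.22,<2.0"   # the constraint published for EDSL
other_spec = ">=2.0"        # hypothetical library that needs NumPy 2.x

shared = [v for v in candidates
          if satisfies(v, edsl_spec) and satisfies(v, other_spec)]
print(shared)  # []
```

No candidate satisfies both specs, which is exactly the situation a resolver reports as a conflict. With the upper bound removed (`">=1.22"`), the same check returns `['2.0.0', '2.1.0']`. For real projects, the `packaging` library's `SpecifierSet` is the more robust way to do this kind of check.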

In essence, the NumPy version constraint is like building a fence around EDSL, preventing it from interacting with the modern Python world. To ensure EDSL remains a valuable tool, we need to tear down that fence and embrace the latest technologies. Let’s explore how we can do this.

Proposed Solution

Okay, so we know we have a problem. What's the solution? There are a couple of ways we can tackle this:

  • Option 1: Loosen the Constraint

    The simplest approach is to update the NumPy constraint in pyproject.toml to allow both NumPy 1.x and 2.x versions. We can do this by changing the line to:

numpy = ">=1.22"


    This change lets EDSL be installed alongside any NumPy version greater than or equal to 1.22. It's the most flexible solution, but it assumes EDSL is fully compatible with NumPy 2.0.

    This solution directly addresses the core problem by removing the upper bound on the NumPy version. By setting the constraint to `numpy >= 1.22`, we tell the package manager (like Poetry or pip) that EDSL is compatible with any NumPy version from 1.22 onwards. This includes NumPy 2.0 and any future versions.

    The immediate benefit of this approach is that it resolves the Python 3.13 compatibility issue. Users can install EDSL in Python 3.13 environments without encountering the build failures caused by older NumPy versions.

    Moreover, it eliminates the dependency conflicts that arise when trying to use EDSL with other libraries that require NumPy 2.0. Developers can freely combine EDSL with the latest scientific Python packages, unlocking new possibilities for research and development.

    However, this approach comes with a caveat. It assumes that EDSL is fully compatible with NumPy 2.0. If there are any compatibility issues, such as changes in NumPy's API or behavior that affect EDSL's functionality, this solution could introduce new problems. Therefore, it’s crucial to thoroughly test EDSL with NumPy 2.0 to ensure everything works as expected.

    In practice, this means running EDSL’s test suite against NumPy 2.0 and carefully examining the results. Any failures or unexpected behavior should be investigated and addressed before releasing the updated constraint. This might involve modifying EDSL’s code to adapt to changes in NumPy 2.0 or implementing workarounds for specific issues. The key is to ensure a seamless transition for users.

  • Option 2: Be More Specific (for now)

    If we know there are specific compatibility issues with NumPy 2.0, we can at least support the latest 1.x versions that play nicer with Python 3.13. This would look like:

    ```toml
    numpy = ">=1.26.4,<2.0"
    ```

This is a more cautious approach. It allows us to support Python 3.13 while giving us time to address any NumPy 2.0 compatibility problems. It buys us some time, but we'll eventually need to tackle full 2.0 support.

This strategy provides a middle ground. It acknowledges the importance of supporting newer Python versions like 3.13 while recognizing the potential challenges of a full-scale transition to NumPy 2.0.

By specifying the constraint `numpy >= 1.26.4, < 2.0`, we pin EDSL to the last 1.x release of NumPy, the 1.x version most likely to work with Python 3.13, while avoiding the 2.x series altogether. Note that the current constraint already permits 1.26.4, so this change only raises the minimum version. It minimizes the risk of new compatibility issues, but given the limited Python 3.13 support in 1.26.4 noted earlier, it may not fully resolve the most pressing problem: the inability to use EDSL in Python 3.13 environments.

The advantage here is that it gives us time to thoroughly investigate and resolve any potential issues with NumPy 2.0 before making a full commitment. We can run extensive tests, examine the code for areas that might be affected by NumPy 2.0 changes, and implement necessary modifications or workarounds. This phased approach reduces the likelihood of introducing bugs or regressions.

However, it’s important to view this solution as a temporary measure. Ultimately, EDSL needs to support NumPy 2.0 to remain relevant and competitive. The longer we delay the transition, the more likely we are to fall behind the rest of the ecosystem. Therefore, while this approach buys us time, it also carries a responsibility to actively work towards full NumPy 2.0 compatibility.
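Whichever option is chosen, examining the code for NumPy 2.0 breakage can start with a simple text scan. The sketch below looks for a handful of aliases that NumPy 2.0 removed. The list is deliberately partial, the sample snippet is invented rather than taken from EDSL, and the function name `find_removed_aliases` is hypothetical. It only catches usages written with an `np.` or `numpy.` prefix, and it will also flag matches inside comments and strings.

```python
import re

# A partial list of aliases removed in NumPy 2.0, mapped to their replacements.
REMOVED_ALIASES = {
    "float_": "float64",
    "NaN": "nan",
    "Inf": "inf",
    "product": "prod",
    "cumproduct": "cumprod",
    "sometrue": "any",
    "alltrue": "all",
}

# Matches e.g. "np.float_" or "numpy.NaN", but not "np.float_power".
PATTERN = re.compile(
    r"\b(?:np|numpy)\.(" + "|".join(map(re.escape, REMOVED_ALIASES)) + r")\b"
)


def find_removed_aliases(source: str) -> list[tuple[int, str, str]]:
    """Return (line number, removed name, suggested replacement) per hit."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in PATTERN.finditer(line):
            name = match.group(1)
            hits.append((lineno, name, REMOVED_ALIASES[name]))
    return hits


# Invented sample input, used only to demonstrate the scan.
sample = (
    "import numpy as np\n"
    "x = np.float_(1.5)\n"
    "y = np.product([1, 2]) if x else np.NaN\n"
    "z = np.float_power(2.0, 3)\n"
)
for lineno, name, replacement in find_removed_aliases(sample):
    print(f"line {lineno}: np.{name} -> np.{replacement}")
```

A scan like this is only a first pass. Running the full test suite against NumPy 2.x remains the real check, and a linter such as Ruff (its NPY201 rule) covers far more of the removed and renamed names.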

Additional Context

To give you a clearer picture of the landscape:

  • NumPy 2.0 was released in June 2024.
  • Most major scientific Python packages now support NumPy 2.0. This shows how widespread the adoption is becoming.
  • Python 3.13 was released in October 2024. Official NumPy support for it arrived in the 2.x series, starting with NumPy 2.1.

Call to Action: Let's Make It Happen!

So, what do you guys think? Would the maintainers be open to a PR that adds NumPy 2.0 support? I'm personally happy to help test compatibility and submit a PR if needed. Let's work together to keep EDSL awesome!

This change isn't just about keeping up with the Joneses; it's about ensuring EDSL remains a viable tool for everyone. By embracing NumPy 2.0, we're paving the way for a more robust, compatible, and future-proof EDSL.

To put it simply, upgrading to NumPy 2.0 support is like giving EDSL a new engine. It unlocks better performance, smoother integration with other tools, and the ability to tackle more complex tasks. It's an investment in the future of the project and the satisfaction of its users. So, let's get the ball rolling and make this happen!