Add Groebner basis related functions for fmpz_mpoly_vec by Jake-Moss · Pull Request #191 · flintlib/python-flint · GitHub

Add Groebner basis related functions for fmpz_mpoly_vec#191

Merged
oscarbenjamin merged 6 commits into flintlib:master from Jake-Moss:master
Aug 29, 2024
Conversation

@Jake-Moss
Contributor

@Jake-Moss Jake-Moss commented Aug 19, 2024

Actual tests are currently missing (only doctests are present) simply because it is late here; I'll write some up tomorrow.

I'm not entirely confident with Groebner bases, so I've lifted the Groebner basis, autoreduction, and Buchberger examples from Wikipedia, and used similar polynomials in the S-polynomial and reduction_primitive_part examples, though they may not be especially motivating.
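Since the S-polynomial construction comes up throughout these examples, here is a rough pure-Python sketch of it. The dict-of-exponent-tuples representation and the lex ordering are my own illustrative choices, not python-flint's API:

```python
from fractions import Fraction

def leading(poly):
    """Leading (monomial, coeff) under lex order.

    poly maps exponent tuples to coefficients, e.g. x^2 - y over (x, y)
    is {(2, 0): 1, (0, 1): -1}; tuple comparison gives lex with x > y.
    """
    m = max(poly)
    return m, poly[m]

def mul_term(poly, exp, coeff):
    """Multiply poly by the single term coeff * x^exp."""
    return {tuple(a + b for a, b in zip(m, exp)): c * coeff
            for m, c in poly.items()}

def sub(p, q):
    """Return p - q, dropping zero coefficients."""
    out = dict(p)
    for m, c in q.items():
        out[m] = out.get(m, 0) - c
        if out[m] == 0:
            del out[m]
    return out

def s_poly(f, g):
    """S-polynomial: cancel the leading terms of f and g against each other."""
    (mf, cf), (mg, cg) = leading(f), leading(g)
    lcm = tuple(max(a, b) for a, b in zip(mf, mg))
    uf = tuple(l - a for l, a in zip(lcm, mf))   # lcm / LM(f)
    ug = tuple(l - b for l, b in zip(lcm, mg))   # lcm / LM(g)
    return sub(mul_term(f, uf, Fraction(1, cf)),
               mul_term(g, ug, Fraction(1, cg)))

# Toy pair: f = x^2 - y, g = x^3 - x; S(f, g) = x*f - g = -x*y + x
f = {(2, 0): 1, (0, 1): -1}
g = {(3, 0): 1, (1, 0): -1}
print(s_poly(f, g))  # → {(1, 1): Fraction(-1, 1), (1, 0): Fraction(1, 1)}
```

Buchberger's algorithm repeatedly reduces such S-polynomials against the current basis and adds any nonzero remainder until all S-polynomials reduce to zero.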

@oscarbenjamin
Collaborator

@Jake-Moss
Contributor Author

Thanks for such a detailed reply! Funny you should say this, because this is exactly what I'm currently working on measuring, with profiling and more benchmarks over (some of) the PHCpack database of polynomial systems. I'm making use of the functions added in this PR and thought it best to just upstream them.
I've run into the exact same computational limits, and I hope that, now that mpoly support is merged, the work I plan to do in sympy/sympy#26793 can alleviate them.

@oscarbenjamin
Collaborator

Have you tried anything else besides SymPy and python-flint in your benchmarks?

I would be interested to know how these things compare in asymptotic scaling, e.g. whether there are any systems that can go a lot higher than SymPy can. Note that a 20x speedup in this area might sound like a lot, but it also potentially doesn't buy you much: maybe you can go from n=7 to n=8, but maybe you still can't, because that actually needs a 1000x speedup or something.

@Jake-Moss
Contributor Author

Have you tried anything else besides SymPy and python-flint in your benchmarks?

Not on these benchmarks in particular. I compared SymPy's dmp_python, python-flint, and Mathematica a while back on gcd and factoring of a rather pathological polynomial (5 generators, 18k terms, 6 MB in text form), and both python-flint and Mathematica beat SymPy by about 280x (~100 ms compared to ~29 s). I've attempted to get access to Maple, but my university denied my application for a license.

I would be interested to know how these things compare for asymptotic scaling like if there are any systems that can go a lot higher than SymPy can.

As would I; I'm not aware of any personally.

@oscarbenjamin
Collaborator

I compared SymPy's dmp_python, python-flint, and Mathematica a while back on gcd and factoring of a rather pathological polynomial (5 generators, 18k terms, 6 MB in text form), and both python-flint and Mathematica beat SymPy by about 280x (~100 ms compared to ~29 s)

That doesn't surprise me. It is possible that Mathematica actually uses Flint for this. You should probably expect a good algorithm well implemented in pure Python in SymPy to be at least 20x slower than Flint anyway but also different algorithms make a big difference in different cases. It is probably not difficult to make some basic improvements to this in SymPy but the main thing is that you need a few different algorithms and good heuristics for choosing between them. I am sure that there are cases where the difference is a lot worse than 280x slower.

@oscarbenjamin
Collaborator

Sage wraps a bunch of different libraries. I assume it has several different implementations and algorithms for computing Groebner bases although I don't know how you can choose between them. The docs claim that Singular is mostly used and that

Singular is widely regarded as the best open-source system for Groebner basis calculation in multivariate polynomial rings over fields

https://doc.sagemath.org/html/en/reference/polynomial_rings/sage/rings/polynomial/multi_polynomial_ideal.html

@GiacomoPope
Contributor

msolve is another library supported by Sagemath, which was added more recently: https://doc.sagemath.org/html/en/reference/polynomial_rings/sage/rings/polynomial/msolve.html

I think the "winner" among CASes for these computations is Magma, which is very sad as it's closed source, so all the magic is essentially hidden behind licenses.

@Jake-Moss
Contributor Author

I've just turned the doc examples and the example shown here into tests, as well as adding equality comparison.

@Jake-Moss
Contributor Author

I'll mark this as ready for review; CI should pass.

@Jake-Moss Jake-Moss marked this pull request as ready for review August 28, 2024 05:55
@Jake-Moss
Contributor Author

Singular is widely regarded as the best open-source system for Groebner basis calculation in multivariate polynomial rings over fields

msolve is another library supported by Sagemath, which was added more recently: https://doc.sagemath.org/html/en/reference/polynomial_rings/sage/rings/polynomial/msolve.html

I think the "winner" among CASes for these computations is Magma, which is very sad as it's closed source, so all the magic is essentially hidden behind licenses.

Thank you both, I will check those out.

@oscarbenjamin
Collaborator

I left a few comments but we can just merge if you prefer.

@GiacomoPope
Contributor

As a general point, I find these 120-character lines hard to read.

As a total aside, would you be interested in adding some linting to the CI? I think something like https://docs.astral.sh/ruff/ (ruff) has been designed to work with Cython (indeed, they measure performance against the Cython repository). If we decided on some config, then we could quite easily format automatically with ruff format ...
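A minimal ruff configuration along these lines might look like the following (the line length, excluded directory, and rule codes here are illustrative assumptions, not project decisions):

```toml
[tool.ruff]
line-length = 120
# Skip the auto-generated bindings
extend-exclude = ["src/flint/flintlib"]

[tool.ruff.lint]
# Start with pycodestyle errors/warnings and pyflakes checks only
select = ["E", "W", "F"]
```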

@oscarbenjamin
Collaborator

As a total aside, would you be interested in adding some linting to the CI?

That's fine by me if others are happy with it. I haven't used ruff much and didn't realise that it works on Cython.

If you run ruff now does it completely transform the entire codebase?

Perhaps a first step would be to use ruff's checking functionality rather than the auto-formatting part. Does it reveal any errors?

It might need some configuration to limit things like reformatting auto-generated files.

@GiacomoPope
Contributor

I just checked and it indeed finds some small things. I have a few minutes now, let me run ruff check on the code we have currently and go from there. I haven't bothered checking ruff format but I imagine it'll change a lot.

@GiacomoPope
Contributor

For formatting, maybe cython still isn't supported, but at least the linter is

astral-sh/ruff#10250

For cython formatting, I don't know what's best, but there must be some kind of solution...

@oscarbenjamin
Collaborator

For cython formatting, I don't know what's best, but there must be some kind of solution...

There might not be. Cython is not such a mainstream language as Python is. Some basic linting would be good though.

@Jake-Moss
Contributor Author

For cython formatting, I don't know what's best, but there must be some kind of solution...

I don't believe there to be. There's cython-lint, which can catch common errors, unused variables/imports, and line lengths, but it can't format. It's what I use, and I have a small config in pyproject.toml that I just don't commit:

[tool.cython-lint]
max-line-length = 120

But it can do a bit more than just that.

@oscarbenjamin
Collaborator

It's what I use and have a small config in pyproject.toml that I just don't commit

It sounds like we're all on board with using a linter then so maybe we should just commit it.

Or would it throw up the whole codebase?

@Jake-Moss
Contributor Author

Or would it throw up the whole codebase?

It does indeed, I get ~1200 lines of warnings from

cython-lint --files **/*.{pyx,pxd}

with this config

[tool.cython-lint]
max-line-length = 120
ignore = ['E225', 'E231']

most of which are line length warnings (~800).

@oscarbenjamin
Collaborator

I think we at least want this in the config:

exclude = 'src/flint/flintlib/.*'

Those files are all automatically generated and have long lines. It is not worth trying to format the auto-generated code nicely I think.

That brings the number of complaints down to about 500 already.

Of those, E501 (line too long) accounts for only 35 lines. Reducing the line length limit to 88 gives 480 lines too long, though.

There are 73 instances of E701, which is multiple statements on one line, like:

            res = fmpq_cmp(s.val, (<fmpq>t).val)
            if op == 0: res = res < 0
            elif op == 1: res = res <= 0
            elif op == 4: res = res > 0
            elif op == 5: res = res >= 0
            else: assert False
            return res

I prefer to see this split onto multiple lines myself but don't have strong feelings about it.

There are 57 W605 (invalid escape sequence). This is something like:

        """
        The constant `\sqrt{\pi}`.
        """

That is a valid complaint: it should be a raw string for embedded LaTeX.
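For context on the W605 fix, here is a quick sketch of why a raw string is the right repair (the example text is made up):

```python
# In an ordinary string, "\s" and "\p" are not recognised escape sequences,
# so Python emits a SyntaxWarning (W605) and keeps the backslash anyway.
# The portable fixes are to double the backslashes or use a raw string.
plain = "The constant `\\sqrt{\\pi}`."   # doubled backslashes
raw = r"The constant `\sqrt{\pi}`."      # raw string: backslashes are literal

print(plain == raw)  # → True
```

Raw strings are the usual convention for docstrings containing embedded LaTeX, since the markup stays readable.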

Many of the complaints are about whitespace and indentation which I think are mostly reasonable although I haven't looked through what it wants for indentation.

There are also a lot of complaints about things being defined or imported but not being used anywhere which definitely seem like reasonable complaints to me.

The only fixer it has apparently changes single quotes to double quotes, which seems like the least useful kind of fixer. It should be trivially possible to implement fixers for things like too many blank lines or whitespace at the end of a line, which are the more tedious things to fix manually. If a linter is going to complain about trivial things, then you want to be able to fix them trivially.

Maybe we can just add something:

#!/usr/bin/env python
# bin/strip_whitespace.py

from pathlib import Path

def clean(filename):
    with open(filename) as fin:
        lines = fin.readlines()
    lines = [line.rstrip() + '\n' for line in lines]
    while lines and lines[-1] == "\n":
        lines.pop()
    with open(filename, 'w') as fout:
        fout.writelines(lines)

src_dir = Path('src')

# Path.walk() needs Python 3.12+; rglob() also works on older versions
for path in src_dir.rglob('*'):
    if path.suffix in ('.py', '.pyx', '.pxd'):
        clean(path)

That script removes 50 of the complaints from current code.

@oscarbenjamin
Collaborator

Okay, looks good to me. Thanks!

@oscarbenjamin oscarbenjamin merged commit 67e3997 into flintlib:master Aug 29, 2024
