Fortran Formatter Bug: Result Clause Removed & Intent Errors

by Viktoria Ivanova

Hey guys! Let's dive into a tricky issue we've spotted in our Fortran formatting tool. It's all about how we handle function result clauses and intent attributes. This might sound a bit technical, but stick with me – it's super important for making sure our code works right!

Bug Description

So, the main problem we're seeing is that when the formatter processes functions that use result clauses, it's dropping the result(name) part from the function's signature. That's not cool! On top of that, it's slapping intent(in) attributes on all the parameters, even when that's not what they should be. Let's break this down a bit more, shall we?

Deep Dive into the Bug

When we talk about function result clauses, we're referring to the part of a Fortran function definition that looks like this: function name(...) result(var). This nifty feature allows you to specify a different name for the return variable of the function. It's super handy for keeping things clear and organized, especially when you're dealing with complex functions. The issue here is that our formatter is just chopping this part off, which is a major bummer for anyone relying on this feature.
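
To make the difference concrete, here's a minimal sketch of the same function written with and without a result clause. The names add_direct, add_named, and total are made up for illustration and don't come from the formatter or its test suite.

```fortran
program result_clause_demo
    implicit none

    ! Both calls print the same sum
    print *, add_direct(1.0, 2.0)
    print *, add_named(1.0, 2.0)

contains

    ! Without a result clause, the return value is stored in a
    ! variable that shares the function's name.
    function add_direct(x, y)
        real :: x, y
        real :: add_direct
        add_direct = x + y
    end function add_direct

    ! With result(total), the return value lives in "total" instead.
    function add_named(x, y) result(total)
        real :: x, y
        real :: total
        total = x + y
    end function add_named

end program result_clause_demo
```

If a tool deletes result(total) from the second signature but leaves the body alone, the assignment to total no longer sets the return value, which is exactly the kind of breakage we're worried about.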

Now, let's talk about intent attributes. In Fortran, intent attributes (intent(in), intent(out), intent(inout)) tell the compiler how a procedure is going to use each of its dummy arguments. This is crucial for both optimization and error checking. The intent(in) attribute means the argument is only read and must not be modified, intent(out) means the procedure is expected to assign the argument a value (whatever it held on entry is treated as undefined), and intent(inout) means the argument is both read and written. Our bug is automatically adding intent(in) to every parameter, which is often incorrect and can lead to some serious head-scratching when your code stops compiling or doesn't behave as expected.
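
Here's a small sketch showing all three intents side by side. It uses a subroutine rather than a function so that each kind of argument has a natural job; add_and_scale and its argument names are hypothetical and chosen only for this example.

```fortran
program intent_demo
    implicit none
    real :: a, b, total

    a = 1.5
    b = 2.0
    call add_and_scale(a, b, total)

    ! a is unchanged, b has been doubled, total holds the original a + b
    print *, a, b, total

contains

    subroutine add_and_scale(x, y, s)
        real, intent(in)    :: x   ! read only; assigning to x is rejected
        real, intent(inout) :: y   ! read first, then updated for the caller
        real, intent(out)   :: s   ! undefined on entry; set before returning

        s = x + y
        y = 2.0 * y
    end subroutine add_and_scale

end program intent_demo
```

Swap the intent on y to intent(in) and the line that doubles y stops compiling, which is why a formatter shouldn't be guessing these attributes.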

Expected Behavior

Okay, so what should be happening? Ideally, our formatter should preserve the function result clauses exactly as they are written. We want to see function name(...) result(var) remain untouched. Also, parameters should not automatically get intent(in) attributes. Instead, the formatter should respect whatever intent is explicitly specified in the code, or if none is specified, leave it as is. Basically, we want the formatter to be a helpful tool, not a meddling one!

The Ideal Scenario

Think of it this way: the formatter's job is to make the code look pretty and consistent, not to change its underlying behavior. So, when it comes to function signatures, we expect the formatter to keep the result(var) clause intact. Parameters should only receive an intent attribute if it's already there in the original code. If a parameter is intended to be modified by the function (i.e., it should have intent(out) or intent(inout)), the formatter shouldn't be forcing it to be intent(in). This ensures that the code's logic remains unchanged after formatting.

Actual Behavior

Unfortunately, what's actually happening is a bit of a mess. The result clauses are being stripped clean off the function signatures, which, as we've discussed, is a big no-no. And yep, you guessed it, all parameters are getting the intent(in) treatment, regardless of their true purpose. But wait, there's more! The original parameter declarations are also being duplicated in the function body. It's like a declaration party in there, but nobody invited clarity.

The Nitty-Gritty of the Problem

Let's break down the actual misbehavior step by step. First, the missing result clause is a straight-up removal of crucial information: without it, the function's return value has to be assigned to the function name itself, so a body that still assigns to the old result variable no longer returns anything meaningful. Second, the incorrect intent attributes are a recipe for disaster. Imagine a function that's supposed to modify one of its arguments; once that argument is marked intent(in), a standard-conforming compiler will reject any assignment to it, so code that used to build suddenly doesn't. Third, the declaration duplication is not just clutter: Fortran doesn't allow the same variable to be given a type twice in one scope, so the duplicated lines are errors in their own right, and they make the code harder to read and maintain. It's like having two copies of the same instruction manual, and neither one is quite right.
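
As a rough illustration of the second point, here's a sketch of a function that legitimately modifies its argument. The function next_id and the variable names are invented for this example; the comment marks the spot where a forced intent(in) would break things.

```fortran
program forced_intent_demo
    implicit none
    integer :: counter, previous

    counter = 0
    previous = next_id(counter)

    ! previous holds the old value, counter has advanced by one
    print *, previous, counter

contains

    ! Returns the current value and advances the counter in place.
    function next_id(n) result(id)
        integer :: n    ! no intent given, so the body may modify n
        integer :: id

        ! If a tool rewrote the declaration of n as
        ! "integer, intent(in) :: n", the assignment to n below
        ! would no longer compile.
        id = n
        n = n + 1
    end function next_id

end program forced_intent_demo
```

The code above is valid as written. The only thing standing between it and a compile error is the formatter leaving that declaration alone.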

Minimal Reproducible Example

To show you exactly what we're dealing with, here's a little chunk of Fortran code that triggers the bug:

program test
    use fortfront, only: transform_lazy_fortran_string_with_format, format_options_t
    implicit none

    ! Source to format: a function that returns its value through result(res)
    character(len=*), parameter :: input = &
        "program test" // new_line('a') // &
        "contains" // new_line('a') // &
        "function calc(x, y) result(res)" // new_line('a') // &
        "real :: x, y, res" // new_line('a') // &
        "res = x + y" // new_line('a') // &
        "end function calc" // new_line('a') // &
        "end program test"

    character(len=:), allocatable :: output, error_msg
    type(format_options_t) :: options   ! left at its defaults

    ! Run the formatter and print whatever it produces
    call transform_lazy_fortran_string_with_format(input, output, error_msg, options)
    print *, output
end program test

This is a simple program with a function calc that adds two numbers and returns the result using a result clause. When we run this through our formatter, we get some pretty wonky output.

Input

Here’s the original code fed into the formatter:

program test
contains
function calc(x, y) result(res)
real :: x, y, res
res = x + y
end function calc
end program test

This is clean, straightforward Fortran code. It defines a program test that contains a function calc. The function takes two arguments, x and y, and returns their sum. The result(res) clause specifies that the return value should be assigned to a variable named res. Nothing too fancy, but it demonstrates the use of a result clause perfectly.

Actual Output

And here's the mess our formatter spits out:

program test
    implicit none
contains
    function calc(x, y)
        implicit none
        real(8), intent(in) :: x, y
        real :: x
        real :: y
        real :: res
        res = x + y
    end function calc
end program test

Notice anything missing? Yep, the result(res) clause is gone! And look at those parameters – they've all been tagged with intent(in), and we've got duplicated declarations. It's like the formatter went rogue and decided to rewrite our code in its own quirky way.

The Key Issues in the Output

  1. Missing result(res): The result clause is completely stripped from the function signature. Without it, the return value would have to be stored in calc itself, but the body still assigns to res, and calc never gets a declared type even though implicit none is in effect. It's like ordering a pizza with all the toppings and getting just the crust.
  2. intent(in) overload: Both x and y are given an intent(in) attribute that the original code never specified. This particular function only reads x and y, but the formatter has no business deciding that; in a function that modifies its arguments, the added attribute turns valid code into a compile error. It's like telling someone they can look but can't touch, even though they need to move things around.
  3. Declaration duplication: The variables x and y are declared twice, once with real(8), intent(in) and again with just real. Fortran doesn't allow a variable's type to be declared twice in the same scope, so this is an error, not just clutter. It's like writing the same phone number twice in your contacts, but one is missing a digit.
  4. Type inconsistency: x and y are declared as real(8) on one line and as plain real on the next. With most compilers those are different kinds (double versus single precision), so the two declarations contradict each other. It's like mixing metric and imperial units in a recipe: things might not turn out quite right.
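
The first issue is easy to catch mechanically. Here's a minimal sketch of a check that compares a signature line before and after formatting. The helper has_result_clause is hypothetical, and the plain substring search is a simplifying assumption; a real check would work on the parsed code rather than on raw text.

```fortran
program check_result_clause
    implicit none
    ! Signature lines taken from the input and the actual output above
    character(len=*), parameter :: before = "function calc(x, y) result(res)"
    character(len=*), parameter :: after  = "function calc(x, y)"

    if (has_result_clause(before) .and. .not. has_result_clause(after)) then
        print *, "result clause was dropped by formatting"
    else
        print *, "result clause status unchanged"
    end if

contains

    ! Naive textual test: true when the line contains "result(".
    ! It would miss spellings such as "result (res)" or "RESULT(res)".
    pure function has_result_clause(line) result(found)
        character(len=*), intent(in) :: line
        logical :: found
        found = index(line, "result(") > 0
    end function has_result_clause

end program check_result_clause
```

A check along these lines could sit in a regression test so the clause can't quietly disappear again.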

Expected Output

This is what we should be seeing:

program test
    implicit none
contains
    function calc(x, y) result(res)
        implicit none
        real(8) :: x, y, res
        res = x + y
    end function calc
end program test

Clean, simple, and true to the original intent. The result(res) clause is there, each variable is declared exactly once with a single consistent type (real(8) throughout), and no sneaky intent(in) attributes have been added.

The Importance of Accurate Formatting

The expected output is crucial because it preserves the original meaning and intent of the code. The function signature remains intact, the parameter declarations are accurate, and the code behaves as expected. This is what we want from a formatter – to make the code look good without changing its functionality.

Issues Identified

Let's recap the main problems we've uncovered:

  1. Missing result clause: The result(res) part is completely removed from the function signature. It's like the formatter has a vendetta against result clauses.
  2. Incorrect intent attributes: Parameters are getting intent(in) slapped on them when they shouldn't (especially problematic for inout or unspecified intent). It's like the formatter is assuming everyone's a read-only kind of programmer.
  3. Declaration duplication: Parameters are being declared twice – once with intent(in) and again without. It's like the formatter is trying to cover its bases, but just making things more confusing.
  4. Type inconsistency: We're seeing mixed real(8) and real declarations for the same variables. It's like the formatter can't decide what kind of numbers we're dealing with.

Impact

This bug is a big deal, guys. Here's why:

  • Severity: High. It breaks function interfaces that rely on result clauses. If you're using result clauses, this formatter is going to mess things up.
  • Correctness: Incorrect intent attributes can cause compilation errors or, even worse, subtle bugs that are hard to track down.
  • Code Quality: Duplicate declarations create confusing code that's harder to read and maintain.
  • Compatibility: Result clauses are important for modern Fortran practices, so this bug makes the formatter incompatible with a lot of modern code.

Why This Matters

This isn't just about making code look pretty; it's about ensuring that the code works correctly. When a formatter changes the fundamental structure of a function, it can introduce serious problems. Incorrect intent attributes can lead to compilation errors or to subtle bugs that are difficult to debug. Duplicate declarations trip up the compiler and confuse the human reader. And stripping out result clauses breaks a key feature of modern Fortran.

Use Cases Affected

This bug is going to hit a lot of different scenarios:

  • Functions returning different types than the function name would suggest.
  • Functions with complex return value handling.
  • Code following modern Fortran best practices with explicit result variables.
  • Generic programming with result clauses.
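
To give a feel for the "complex return value handling" case, here's a sketch of a function that builds an allocatable array in its result variable. The function linspace and its arguments are invented for this example and aren't taken from any affected project.

```fortran
program array_result_demo
    implicit none
    real, allocatable :: v(:)

    ! Five evenly spaced points from 0.0 to 1.0
    v = linspace(0.0, 1.0, 5)
    print *, v

contains

    ! Returns n evenly spaced points from lo to hi (assumes n >= 2).
    function linspace(lo, hi, n) result(points)
        real, intent(in)    :: lo, hi
        integer, intent(in) :: n
        real, allocatable   :: points(:)
        integer :: i

        allocate(points(n))
        do i = 1, n
            points(i) = lo + (hi - lo) * real(i - 1) / real(n - 1)
        end do
    end function linspace

end program array_result_demo
```

Here the result variable carries its own attributes (allocatable, an array shape) and is filled in over several statements, so losing result(points) from the signature would leave the whole body working on the wrong variable.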

Real-World Scenarios

Imagine you're working on a scientific simulation that uses functions to calculate complex physical quantities. Many of these functions might use result clauses to clearly specify the return variable. If the formatter strips out these clauses, the simulation could produce incorrect results. Or, consider a library that uses generic programming techniques with result clauses. This bug could make the library unusable with the formatter.

Environment

We've seen this bug when using the formatter via fluff (https://github.com/krystophny/fluff). It affects all functions with result clauses and is consistently reproducible. So, it's not a one-off glitch; it's a systemic issue.

Workaround

Unfortunately, there's currently no workaround for preserving result clauses. If you're using result clauses, you'll need to avoid using the formatter on those parts of your code, which is less than ideal.

The Need for a Fix

The lack of a workaround highlights the urgency of this issue. Users who rely on result clauses are essentially blocked from using the formatter. This limits the tool's usefulness and can be frustrating for developers who want to maintain consistent code formatting across their projects.

Labels

We're tagging this as:

  • bug
  • high
  • functions
  • parsing
  • result-clause

This helps us prioritize and categorize the issue so we can get it fixed ASAP!

Let's squash this bug and make our Fortran formatting tool awesome again!