What is correct INDEX() behaviour in array context?

Discuss the spreadsheet application
Post Reply
Jim DeLaHunt
Posts: 4
Joined: Tue Oct 14, 2014 4:31 am

What is correct INDEX() behaviour in array context?

Post by Jim DeLaHunt »

I tried to use the INDEX() function in an array formula context, and I got unexpected results when I used a zero parameter to get it to return a full row or column.

How to reproduce:
  • Open a blank spreadsheet
  • In an arbitrary cell, enter this formula:

    Code: Select all

    =INDEX({1;3;5|7;9;10};{2|1};0;1)
  • Press Command-Shift-Enter to make this formula an array formula.
  • The array formula fills three columns by two rows of cells.
Observed behaviour:

Cell contents are

Code: Select all

{7; #VALUE!; #VALUE! | 1; #VALUE!; #VALUE}
.

Expected behaviour:

Cell contents are

Code: Select all

{7; 9; 10 | 1; 3; 5}
.

Discussion:

The "column" parameter of the INDEX() function is 0. This means that, outside of an array context, the INDEX function returns all columns of the selected row of an array.

The "row" parameter of the function is

Code: Select all

{2|1}
. I expect that in the array context, each row of this vector would be applied to the array.

If the formula is modified to have only a scalar value for the "row" parameter, as in

Code: Select all

=INDEX({1;3;5|7;9;10};2;0;1)
, as an array formula, then the result array is one row by three columns, with the value

Code: Select all

{7; 9; 10}
, as expected.

If the formula is modified to have a nonzero value for the "column" parameter, as in

Code: Select all

=INDEX({1;3;5|7;9;10};{2|1};1;1)
, as an array formula, then the result array is two rows by one columns, with the value

Code: Select all

{7|1}
, as expected.

If the formula is modified to delete the "column" and "range" parameters, the observed behaviour is unchanged.

Observed only on OO.o 4.0.1 on Mac OS X 10.6.8. I can't run OO.o 4.1.x on OS X 10.6.8, so I haven't tried it on OO.o 4.1.

Thus I think that this is an interaction between the array formula context, and the array parameter for "row" where a scalar is expected, and the 0 value for the parameter "column" asking for an array rather than a scalar result.

One possible response is, this computation model for INDEX() in an array context calls for this. Explain the computation model, and I'll be happy to write it up for the documentation at https://wiki.openoffice.org/wiki/Docume ... X_function .

This behaviour is described in Issue 125743. (The bug report was closed without considering its merits, which seems like a surprising way to handle a bug report, but that's a separate matter.)

So, what is the correct behaviour when INDEX() is asked to return a full row in an array formula context? Where is this specified?

Thanks in advance for any clarifications you can give.
--Jim DeLaHunt http://jdlh.com (OpenOffice 4.0.1 on Mac OS X 10.6.8)
User avatar
Lupp
Volunteer
Posts: 3757
Joined: Sat May 31, 2014 7:05 pm
Location: München, Germany

Re: What is correct INDEX() behaviour in array context?

Post by Lupp »

I rarely use "inline constant arrays" and therefore I will not pretend being your partner in this discussion.
I just can tell you where to find the mandatory specification. It is http://docs.oasis-open.org/office/v1.2/ ... -part2.pdf (or the odt version of the document). I'm afraid the behaviour you are interested in may not be specified at all (or left to the implementation). Cf to Chapter 3.3 of the above mentioned specification. It should be conclusive.
Good luck trying to understand it without doubts left!

(The INDEX() function is specified under 6.14.6 but there you won't find a specific clue.)
On Windows 10: LibreOffice 25.8.4 and older versions, PortableOpenOffice 4.1.7 and older, StarOffice 5.2
---
Lupp from München
User avatar
acknak
Moderator
Posts: 22756
Joined: Mon Oct 08, 2007 1:25 am
Location: USA:NJ:E3

Re: What is correct INDEX() behaviour in array context?

Post by acknak »

I think the problem with the bug report is that there is no clearly correct behavior. If you find a situation where Calc adds 2 and 3 to get 6, then obviously there is a problem, and the correct behavior is clear as well. I'm not seeing where that's the case here. Calc's behavior may not make sense but there's no way to say that it's wrong.

Maybe if you had a set of Excel sheets that depended on some other result you could make a case that (at least) some people expect INDEX to work differently.
AOO4/LO5 • Linux • Fedora 23
User avatar
MrProgrammer
Moderator
Posts: 5442
Joined: Fri Jun 04, 2010 7:57 pm
Location: Wisconsin, USA

Re: What is correct INDEX() behaviour in array context?

Post by MrProgrammer »

Hi, and welcome to the forum. Thank you for the nice example to test with.

=INDEX({1;3;5|7;9;10};2;0;1) works because the third argument is expanded to {1;2;3} and the scalar second and fourth arguments are expanded to match the third argument giving =INDEX({1;3;5|7;9;10};{2;2;2};{1;2;3};{1;1;1}). The dimensions of arguments 2, 3, and 4 are all now 1×3 and so the result is 1×3.

=INDEX({1;3;5|7;9;10};{2|1};1;1) works because the third and fourth scalar arguments are expanded to =INDEX({1;3;5|7;9;10};{2|1};{1|1};{1|1}). The dimensions of arguments 2, 3, and 4 are all now 2×1 and so the result is 2×1.

I believe =INDEX({1;3;5|7;9;10};{2|1};0;1) would be expanded to =INDEX({1;3;5|7;9;10};{2|1};{1;2;3|1;2;3};{1;1;1|1;1;1}). This fails because {2|1} has only one column and can’t supply indices for columns 2 and 3 of the result. That’s why you get #VALUE! in those columns. I realize that you would like your {2|1} to be expanded to {2;2;2|1;1;1} but that doesn’t happen because {2|1} is a matrix, not a scalar, and only scalars are expanded automatically.

=INDEX({1;3;5|7;9;10};{2;2;2|1;1;1};0;1) should work. It does for me. I am not a developer and have not seen any formal description of array processing in Calc. However I have used it enough to have an intuitive feeling for how it works, or doesn't work, in many situations. I try to avoid it because it can be very difficult to understand, and because there are some acknowledged bugs. But I realize that there are cases when it is inconvenient to perform calculations any other way.

If this answered your question please go to your first post use the Edit button and add [Solved] to the start of the title. You can select the green checkmark icon at the same time.
Mr. Programmer
AOO 4.1.7 Build 9800, MacOS 13.7.8, iMac Intel.   The locale for any menus or Calc formulas in my posts is English (USA).
Jim DeLaHunt
Posts: 4
Joined: Tue Oct 14, 2014 4:31 am

Re: What is correct INDEX() behaviour in array context?

Post by Jim DeLaHunt »

Thanks, MrProgrammer, for taking a look at my situation.

Your intuitive explanation about how Calc "expands" scalar and zero arguments is appealing. Sometimes, it matches the behaviour I see from Calc. Unfortunately, sometimes it doesn't.

For instance, my example =INDEX({1;3;5|7;9;10};{2|1};0;1) give this result:

Code: Select all

7	#VALUE!	#VALUE!
1	#VALUE!	#VALUE!
Your suggestion is that my example "would be expanded to =INDEX({1;3;5|7;9;10};{2|1};{1;2;3|1;2;3};{1;1;1|1;1;1})". You expect that "This fails because {2|1} has only one column and can’t supply indices for columns 2 and 3 of the result." But for me it doesn't fail, nor does it give me the same values as my example, nor does it give me values I think are intuitively correct. I get:

Code: Select all

7	3	5
1	9	10
(It isn't intuitively correct to me that the first column of the result draws from different rows than columns 2 and 3 of the result.)

Another person pointed me to the OpenDocument specification, v1.2, Part 2, which is apparently the authority on how Calc should behave. I read the sections 6.14.6 INDEX and 3.3 Non-Scalar Evaluation (aka 'Array expressions'). They are the sort of specification I was looking for.

My first case, with scalar arguments and a "0" parameter for column, seems to be covered by section 2.2.1:
2.2.1) Functions returning arrays are not eligible for implicit iteration. When evaluated in 'matrix' mode the {0;0}th element is used.
In other words, I think this specification says that my example gets interpreted as =INDEX({1;3;5|7;9;10};{2|1};1;1) (note column of 1, instead of 0). However, the result I see is:

Code: Select all

7	7	7
1	1	1
(Note that columns 2 and 3 do not have the #VALUE! error, instead they duplicate column 1.)
--Jim DeLaHunt http://jdlh.com (OpenOffice 4.0.1 on Mac OS X 10.6.8)
Jim DeLaHunt
Posts: 4
Joined: Tue Oct 14, 2014 4:31 am

Re: What is correct INDEX() behaviour in array context?

Post by Jim DeLaHunt »

Thanks to Lupp for pointing me to the OpenDocument specification, v1.2, Part 2, which is apparently the authority on how Calc should behave. This is the sort of specification for which I was looking.

My case, with some array values for normally scalar arguments, and a "0" parameter for column, seems to be covered by 3.3 Non-Scalar Evaluation (aka 'Array expressions'), section 2.2.1:
2.2.1) Functions returning arrays are not eligible for implicit iteration. When evaluated in 'matrix' mode the {0;0}th element is used.
We can quibble about what section 3.3 says about how Calc should behave, and how this differs from the way Calc 4.0.1 actually behaves. But the behaviour I was expecting was that functions returning arrays would be eligible for implicit iteration. The spec isn't giving me that.

In that case, I think the specification for 6.14.6 INDEX, and the documentation wiki for INDEX, and the OpenOffice Help for INDEX, are lacking. They should have some mention that a parameter value of 0 is not permitted in some circumstances. And, documentation for any other "function returning arrays" should have a similar mention.

I would like to see a note like: Note: when this function is used in an array formula context, and when one of the row, column, or area parameters are arrays instead of scalar values, then neither the row nor the column parameter is permitted to have the value 0. The function will not return the expected array in this situation. And a similar note should be in the specification and documentation for any other function returning arrays. But I will leave that for a separate discussion thread.
--Jim DeLaHunt http://jdlh.com (OpenOffice 4.0.1 on Mac OS X 10.6.8)
Post Reply