Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Flattening Tuples, finding Variables by name, and converting Time Tuples #115

Merged
merged 72 commits into from
Aug 28, 2020
Merged
Show file tree
Hide file tree
Changes from 60 commits
Commits
Show all changes
72 commits
Select commit Hold shift + click to select a range
86bc418
Start a TimeTupleToTime op with a test dataset
sapols Jul 21, 2020
ceba70a
Typo fix
sapols Jul 21, 2020
6167a76
Start adding better support for tuple dataTypes
sapols Jul 23, 2020
48fbf3d
Hack to support preserving a name for flattened Tuples
sapols Jul 23, 2020
4785239
Start porting over code from v2
sapols Jul 23, 2020
4bdb686
Update schema URL
sapols Jul 24, 2020
1208888
Correct prepending of Tuple IDs
sapols Jul 24, 2020
826059a
Start unit testing Tuple flattening
sapols Jul 24, 2020
e3acd0a
Add a test
sapols Jul 24, 2020
5735070
Revent FDML back to how it should be
sapols Jul 24, 2020
3b3defe
Remove unused function
sapols Jul 24, 2020
98dd45b
Start trying to figure out Tuples in getPath
sapols Jul 24, 2020
d8b4f40
Refactor flatten and add a test
sapols Jul 28, 2020
f13984c
Remove newline
sapols Jul 28, 2020
bda9a92
Add TODO
sapols Jul 28, 2020
d61fe0e
Remove TODO
sapols Jul 28, 2020
b346aab
Add a comment
sapols Jul 28, 2020
ba33ab8
Remove unused fdml
sapols Jul 28, 2020
87c7a85
Remove test
sapols Jul 28, 2020
9dfd4c0
Remove comment
sapols Jul 28, 2020
f50ba54
Remove changes
sapols Jul 28, 2020
5ef2935
Add newline
sapols Jul 28, 2020
e1c61bd
Start building Time Scalar for model
sapols Jul 28, 2020
15ce30a
Write and test hardcoded version of operation
sapols Jul 29, 2020
dcdbc06
Add a TODO
sapols Jul 30, 2020
16b40da
Replace a pattern match with getOrElse
sapols Jul 30, 2020
c1d84ab
Replace text data and test FDML with a mock dataset in code
sapols Jul 30, 2020
9bb3fa6
Replace .map().getOrElse() with traverse and fold
sapols Jul 30, 2020
3e570c8
Replace "Scalar" with "Time"
sapols Jul 30, 2020
57b5890
Beef up test assertions
sapols Jul 30, 2020
05d652d
Use LatisException instead of RuntimeException
sapols Jul 30, 2020
e9ea660
Handle and test case where flattening nested Tuples without any IDs
sapols Jul 30, 2020
93a1ec0
Align pluses
sapols Aug 4, 2020
914e0bd
Add a TODO
sapols Aug 4, 2020
6eca9ca
Return position of first Scalar in Tuple when searching Tuple ID
sapols Aug 5, 2020
084226f
Replace TODOs with an explanatory comment
sapols Aug 5, 2020
f7b3896
Add commented-out code that finds position and length of Time Tuple
sapols Aug 5, 2020
adb53c2
Alter commented-out code and add a TODO
sapols Aug 5, 2020
8b9135b
Frst real implementation of applyToData
sapols Aug 6, 2020
76aa7af
Alter TODO comment
sapols Aug 6, 2020
d2ef547
Extend MapOperation instead of UnaryOperation
sapols Aug 6, 2020
67bd76c
Unit test new dataType.getPath behavior
sapols Aug 6, 2020
0437435
Recurse to replace time Tuple instead of assuming it's the domain
sapols Aug 7, 2020
e0ed484
Remove unneeded arg
sapols Aug 7, 2020
c3acb0f
Remove intermediate val
sapols Aug 7, 2020
e815266
Replace time Tuple with improved model.map and add a unit test
sapols Aug 7, 2020
85eff89
Fix test by expecting the dot added by flatten()
sapols Aug 9, 2020
2103e76
Refactor flatten and its unit tests assuming all nested Tuples have IDs
sapols Aug 18, 2020
b54c9cc
Remove code and tests for generating "_#" IDs
sapols Aug 18, 2020
560da2f
Revert GroupByVariable test now that flatten isn't prepending a dot
sapols Aug 18, 2020
9ee7f76
Add a TODO
sapols Aug 18, 2020
afb3f25
Support searching fully qualified IDs (e.g. "foo.bar.a") in getPath
sapols Aug 19, 2020
d119980
Update a TODO
sapols Aug 19, 2020
3c3b034
Don't use intermediate val
sapols Aug 21, 2020
ebc368f
Remove unneeded curlies
sapols Aug 21, 2020
dab039a
Guard against calling .head on empty list, improve error msgs
sapols Aug 21, 2020
e6020a5
Remove unneeded assertions about mockDataset's contents
sapols Aug 21, 2020
c4bba7a
Reduce code duplication with a helper function
sapols Aug 21, 2020
0ba05be
Improve names of unit tests for TimeTupleToTime
sapols Aug 21, 2020
64040c0
Combine indistinguishable cases in flatten's pattern match
sapols Aug 21, 2020
221b656
Add a comment to better explain Tuple ID logic in flatten
sapols Aug 21, 2020
d47fd09
Throw an error if time tuple is in a nested function
sapols Aug 21, 2020
e4929d4
Preserve time tuple metadata
sapols Aug 25, 2020
106e505
Typo fix: nested "Tuples", not nested "Functions"
sapols Aug 25, 2020
1748012
Use t.id.nonEmpty instead of !t.id.isEmpty
sapols Aug 25, 2020
06170eb
Simplify applyToModel by mapping to find time Tuple
sapols Aug 25, 2020
96b2deb
Change test Scalar types from int to string
sapols Aug 25, 2020
f52d4dd
Remove unused line of code
sapols Aug 25, 2020
9623da5
Move data slicing logic into helper function
sapols Aug 26, 2020
95475e4
Refactor findVariable to use findAllVariables.head
sapols Aug 26, 2020
3d90a71
Use findVariable instead of findAllVariables
sapols Aug 26, 2020
b20d39c
Replace pattern match in findVariable with .headOption
sapols Aug 28, 2020
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
54 changes: 46 additions & 8 deletions core/src/main/scala/latis/model/dataType.scala
Original file line number Diff line number Diff line change
Expand Up @@ -5,17 +5,19 @@ import latis.metadata._
import latis.util.DefaultDatumOrdering
import latis.util.LatisException

import scala.collection.mutable.ArrayBuffer

/**
* Define the algebraic data type for the LaTiS implementation of the
* Functional Data Model.
*/
sealed trait DataType extends MetadataLike with Serializable {

// Apply f to Scalars only, for now
/** Recursively apply f to this DataType. */
def map(f: DataType => DataType): DataType = this match {
case s: Scalar => f(s)
case Tuple(es @ _*) => Tuple(es.map(f))
case Function(d, r) => Function(d.map(f), r.map(f))
case t @ Tuple(es @ _*) => f(Tuple(t.metadata, es.map(f)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that we can always include metadata munging as part of the function f. This simply carries the original metadata through so it can do so. That makes me feel better about DataType.map.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good! I was pretty convinced I nailed the logic here. It's pretty simple actually.

case func @ Function(d, r) => f(Function(func.metadata, d.map(f), r.map(f)))
}

/**
Expand Down Expand Up @@ -44,6 +46,31 @@ sealed trait DataType extends MetadataLike with Serializable {
def findVariable(variableName: String): Option[DataType] =
getVariable(variableName)
//TODO: support aliases
//TODO: could do findAllVariables(variableName).head like in v2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm leaning towards doing this: have findVariable delegate to findAllVariables.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll do that then.


/**
* Find all Variables within this Variable by the given name.
* TODO: support aliases
*/
def findAllVariables(variableName: String): Seq[DataType] = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit uneasy about this approach from v2 since it implies that there can be multiple variables with the same name. Maybe this is OK in the context of name (as opposed to id) given aliases. Though, ambiguous aliases smells like a potential problem. The idea is that a user could use an alias in a selection. Should it apply to all variables with that alias? Probably not.

This is relevant in v2 where any variable of type Time gets the alias time. In that case we apply the selection to the first (presumably the primary) time variable. The idea here is that we can use standard names for standard variables like time, wavelength, irradiance....

We also add the original variable name to the alias list when it is renamed.

These uses of alias are convenient, except when they aren't. Mathematical rigor would suggest that we never have this ambiguity problem, but aliases are convenient. Hmm. Maybe we better punt on aliases for now and hide this from the public API.

variableName.split('.') match {
case Array(_) =>
val vbuf = ArrayBuffer[DataType]()
if (this.id == variableName) vbuf += this //TODO: use hasName to cover aliases?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One of these days the id method (from MetadataLike) should return an Identifier. I suspect we could use hasName instead of checking for equality like this. (#130 )

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that change is coming. I'm sure you're right, in theory, but as far as I'm aware, hasName doesn't actually exist in v3 yet. Lol

this match {
case _: Scalar =>
case Tuple(es @ _*) =>
vbuf ++= es.flatMap(_.findAllVariables(variableName))
case Function(d, r) =>
vbuf ++= d.findAllVariables(variableName)
vbuf ++= r.findAllVariables(variableName)
}
vbuf.toSeq
case Array(n1, n2 @ _*) => {
findAllVariables(n1).flatMap(_.findAllVariables(n2.mkString(".")))
}
}
}

/**
* Return the function arity of this DataType.
Expand Down Expand Up @@ -73,34 +100,45 @@ sealed trait DataType extends MetadataLike with Serializable {
* Return this DataType with all nested Tuples flattened to a single Tuple.
* A Scalar will remain a Scalar.
* This form is consistent with Samples which don't preserve nested Functions.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm just noticing this comment. I know this line isn't part of the PR, but flatten does preserve nested Functions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should say "... Samples which don't preserve nested Tuples"

* Flattened Tuples retain the ID of the outermost Tuple.
*/
def flatten: DataType = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a question that's more for @dlindhol and @lindholc . flatten on List will only flatten one layer deep, so something like List[List[List[Int]]] will flatten to List[List[Int]]. It looks like this implementation on DataType will flatten to have no nested tuples. Do we want to be consistent with List? I'm worried the name is misleading.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Point well taken, but I think the point is to eliminate nested tuples because (1) they cause problems and (2) they aren't preserved on the Data side anyway. So that's definitely what the method should do; maybe the question is more if "flatten" is an appropriate name or not.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree flattening everything is more useful and it's what we want here. I wouldn't dare suggest a new name for it because that's hard to do.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eew. I didn't realize that List.flatten didn't recurse. How about something like flattenAll? Surely there is some precedence out there.

var tupIds = ""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before I try to wrap my head around the current logic, it smells like this tupIds should be part of the recursive accumulation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like I mentioned in the standup, making tupIds part of the recursion was my first idea but I couldn't get it to work since we recurse with two methods here: go and flatten itself. The current logic with the var is the best I got.

// Recursive helper function that uses an accumulator (acc)
// to accumulate types while in the context of a Tuple
// while the recursion results build up the final type.
def go(dt: DataType, acc: Seq[DataType]): Seq[DataType] = dt match {
case s: Scalar => acc :+ s
case Tuple(es @ _*) => es.flatMap(e => acc ++ go(e, Seq()))
case Function(d, r) => acc :+ Function(d.flatten, r.flatten)
//prepend Tuple ID(s) with dot(s) and drop leading dot(s)
case s: Scalar => acc :+ s.rename(s"$tupIds.${s.id}".replaceFirst("^\\.+", ""))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of the replaceFirst might it be clearer if we do something like if (tupIds.nonEmpty) s.rename... else s?

Copy link
Contributor Author

@sapols sapols Aug 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code did something similar to that at one point, but there is one edge case where the if (tupIds.nonEmpty) ... else s logic doesn't work. If I could've found a cleaner way to do this, I would've done it.

case Function(d, r) => acc :+ Function(d.flatten, r.flatten)
case t @ Tuple(es @ _*) => if (tupIds.isEmpty && !t.id.isEmpty) tupIds = t.id else tupIds += s".${t.id}"
es.flatMap(e => acc ++ go(e, Seq()))
}

val types = go(this, Seq())
types.length match {
case 1 => types.head
case _ => Tuple(types)
case _ =>
if (tupIds.split('.').isEmpty) Tuple(types)
else Tuple(Metadata(tupIds.split('.').head), types)
}
}

/**
* Return the path within this DataType to a given variable ID
* as a sequence of SamplePositions.
* Note that length of the path reflects the number of nested Functions.
* When searching a Tuple's ID, the path to the first Scalar in the Tuple is returned.
*/
def getPath(id: String): Option[SamplePath] = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't about the changes you made but about this function in general. It looks like this is intended to be called on a Function and won't work if called on a Scalar or Tuple. Maybe the only use case is to call it on a Function, but it seems like a bug for it to be defined for any DataType.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is where the idea of a ConstantFunction can come to the rescue. Scalar or Tuple can be thought of a Functions of arity zero. A Sample of such would have an empty domain with the data in the range. So, a SamplePosition for a Scalar or Tuple would necessarily be a RangePosition.


// Recursive function to try paths until it finds a match
def go(dt: DataType, id: String, currentPath: SamplePath): Option[SamplePath] =
if (id == dt.id) Some(currentPath) //found it //TODO: use hasName to cover aliases?
//TODO: use hasName to cover aliases?
//searching fully qualified ID with namespace
if (id.contains('.') && dt.id == id) Some(currentPath) //found it
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What did we decide about dots in Identifiers? I recall that they are not allowed in an Identifier but that qualified identifiers are another matter.

I also recall cringing at the thought that we need tuples of arity one to represent, for example, B.x where the tuple provides the B namespace if we can't define the scalar identifier to be B.x.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The conclusion for now: Stick with string identifiers with dots. We'll make a new ticket to rely exclusively on the model structure to preserve namespace and use Identifier here and elsewhere without support for dots (".") (#130 ). flatten can thus be an encapsulated implementation detail as needed to achieve some result.

I don't think there is a use case for users to flatten their datasets since it doesn't alter the data. The concept only seems useful for us to map a variable to a sample element (as here) or to capture the structural name space in a flattened view of the metadata (e.g. csv column headings). It doesn't seem worthy of the public API.

Presumably, the getPath use case could be done without needing to reconstruct a flattened representation of the model. Something logically equivalent to zipping the hierarchical structure of the model with the flattened structure of a Sample, but without the baggage of actually doing so. (#129 )

//searching variable ID without namespace
else if (dt.id.split('.').contains(id)) Some(currentPath) //found it
else
dt match { //recurse
case _: Scalar => None //dead end
Expand Down
78 changes: 78 additions & 0 deletions core/src/main/scala/latis/ops/TimeTupleToTime.scala
Original file line number Diff line number Diff line change
@@ -0,0 +1,78 @@
package latis.ops

import cats.implicits._
import latis.data._
import latis.metadata.Metadata
import latis.model._
import latis.time.Time
import latis.util.LatisException

/**
* Defines an Operation that converts a Tuple of time values to a single time Scalar.
*
* @param name the name of the Tuple that stores the time values
*/
case class TimeTupleToTime(name: String = "time") extends MapOperation {

override def applyToModel(model: DataType): DataType = {
val timeTuple = model.findAllVariables(name) match {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this is a case where we should be returning a List instead of Seq?
Otherwise I'm surprised that matching Nil even works.
The second case could then be as Ryan suggested below.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or even better, refactor and use findVariable to do this (return the head if more that one are found) and deal with Option.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may have seen the TODO I added underneath dataType.findVariable:

//TODO: could do findAllVariables(variableName).head like in v2

That would replace what findVariable currently does, which is effectively:

getScalars.find(_.id == variableName)

I personally think the new approach would be better, which is why I added the TODO. I just wasn't sure if you were ready to commit to that behavior. If you are, I can make the change then refactor this code in TimeTupleToTime.

case Nil => throw new LatisException(s"Cannot find variable: $name")
case vs: Seq[DataType] => vs.head
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any chance vs.head is not a Tuple? You could do
case (v: Tuple) :: _ => v

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically vs.head could not be a Tuple if someone passed the name of an extant non-Tuple variable to this Operation. I've updated my error handling here to deal with that case.

}

val time: Scalar = timeTuple match {
case Tuple(es @ _*) =>
//build up format string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style question: Should this kind of comment be elevated a bit with a space and capitol letter? Maybe include the s to be consistent with method comments?
// Builds ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eh, personally, I like it better lowercase and without a space. Lowercase and no space is what we normally do for comments, and I don't see this comment as anything special in need of "elevating." I'd be cool with adding s to make it present tense though. But even that doesn't seem necessary because normal comments don't make it into the Scaladoc. As I recall, the only reason we add s to Scaladoc comments is so that they read consistently for users who are browsing the Scaladoc.

val format: String = es.toList.traverse(_("units"))
.fold(throw new LatisException("A time Tuple must have units defined for each element."))(_.mkString(" "))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm imagining a better multi-line expression here. Then again, folds still don't look natural to me, yet.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a strong opinion. @lindholc actually wrote this line like this in a PR comment then I copied it into this code. I don't mind the long one-liner here.

//make the time Scalar
val metadata = Metadata("id" -> "time", "units" -> format, "type" -> "string")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we preserve the tuple's ID instead of forcing it to be "time" here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's smart. We know the time Tuple will have a name because we have to search that name to get the variable. Since someone already went through the trouble of naming the time Tuple—and since, as the default param suggests, the name of the time Tuple will usually be "time" anyway— I don't see a reason to force the ID to be "time". I'll change this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Time(metadata)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There might be some metadata in the original time tuple worth preserving here.

Copy link
Contributor Author

@sapols sapols Aug 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call.

  • Preserve original time tuple metadata.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

case _ => throw new LatisException(s"Variable '$name' must be a Tuple.")
}

model.map {
case t: Tuple if t.id == name => time //TODO: support aliases with hasName?
case v => v
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like we should be able to traverse with map first. Then when we stumble upon desired time tuple, construct the time (via a function?) and replace it. Otherwise it feels rather imperative.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😮 I didn't think of this! That lets me remove so much code. Beautiful.

}
}

override def mapFunction(model: DataType): Sample => Sample = {
val timePos: SamplePosition = model.getPath(name) match {
case Some(List(sp)) => sp
case _ => throw new LatisException(s"Cannot find path to variable: $name")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can getPath return a list of more than one element? Maybe that would require matching on aliases. If so, we might not want to throw an exception here, or at least change the message.

Copy link
Contributor Author

@sapols sapols Aug 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it technically can; the scaladoc for getPath says:

Return the path within this DataType to a given variable ID
as a sequence of SamplePositions.
Note that length of the path reflects the number of nested Functions.

So if the time Tuple was inside a nested Function or two, we'd get our error "Cannot find path to variable: $name". I'm not exactly sure how I'd have to alter the pattern match on timePos to handle this nested Function case, but I've also never seen a dataset that put a time Tuple in a nested Function, so I'm tempted to say that our error message here is correct... we cannot find a path to a variable in a nested function.

Maybe it'd be appropriate to add a case between these two cases that catches a List of SamplePositions longer than one, then throw an error saying we don't handle time Tuples in nested Functions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it'd be appropriate to add a case between these two cases that catches a List of SamplePositions longer than one, then throw an error saying we don't handle time Tuples in nested Functions.

case None => throw new LatisException(s"Cannot find path to variable: $name")
case _ => throw new LatisException(s"A time Tuple in a nested Function?!")

}
val timeLen: Int = model.findAllVariables(name) match {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm really liking the idea of refactoring findVariable to do this.

case Nil => throw new LatisException(s"Cannot find variable: $name")
case vs: Seq[DataType] => vs.head match {
case t: Tuple => t.flatten match {
case tf: Tuple => tf.elements.length //TODO: is this "dimensionality"? Should it be a first class citizen?
}
case _ => throw new LatisException(s"Variable '$name' must be a Tuple.")
}
}

(sample: Sample) => sample match {
case Sample(dd, rd) =>
//extract text values and join with space
//TODO: join with delimiter, problem when we use regex?
timePos match {
case DomainPosition(n) =>
val domain = dd.slice(0, n) ++ Seq(time(n, timeLen, dd)) ++ dd.slice(n+timeLen, dd.length)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Factoring out the time function was a good idea. Could we take it a step further and move the slicing logic there, too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call. No reason that wouldn't work.

Sample(domain, rd)
case RangePosition(n) =>
val range = rd.slice(0, n) ++ Seq(time(n, timeLen, rd)) ++ rd.slice(n+timeLen, rd.length)
Sample(dd, range)
}
}
}

/** Helper function to slice a time Datum out of a time Tuple given its position and length. */
private def time(pos: Int, len: Int, data: Seq[Data]): Datum = {
val timeData = data.slice(pos, pos+len)
Data.StringValue(
timeData.map { case d: Datum => d.asString }.mkString(" ")
)
}

}
Loading