When performing a dimensionality reduction on tensors, the trace is often zero.

post by Joseph Van Name (joseph-van-name) · 2023-08-02T21:06:55.423Z


In this post, we shall define my new dimensionality reduction for tensors in $(K^r)^{\otimes d}$ where $K$ is either the field of real numbers or the field of complex numbers, and we shall make an empirical observation about the structure of the dimensionality reduction. There are various simple ways of adapting this dimensionality reduction algorithm to other spaces of tensors and even to mixed quantum states (mixed states are just positive semidefinite matrices in $M_n(\mathbb{C})$ with trace 1), but that will be a topic for another post.

This dimensionality reduction shall represent tensors in $(K^r)^{\otimes d}$ as tuples of matrices $(X_1,\dots,X_r)$. Computer experiments indicate that, in many cases, we have $\operatorname{Tr}(X_{i_1}\cdots X_{i_e})=0$ whenever $e$ is not a multiple of $d$.

If $A$ is a matrix, then the spectral radius of $A$ is the value

$\rho(A)=\max\{|\lambda|:\lambda\text{ is an eigenvalue of }A\}$.

If $A$ is a matrix, then define the conjugate matrix $\overline{A}$; this is the matrix obtained from $A$ by replacing each entry with its complex conjugate.

If $(A_1,\dots,A_r)$ is a tuple of real or complex matrices of the same size, then define the $L_2$-spectral radius by setting

$\rho_2(A_1,\dots,A_r)=\rho(A_1\otimes\overline{A_1}+\dots+A_r\otimes\overline{A_r})^{1/2}$.
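For readers who want to experiment, here is a minimal numerical sketch of the two quantities defined above (this is my own illustration, not the code used for the experiments in this post), assuming the $L_2$-spectral radius is $\rho(A_1\otimes\overline{A_1}+\dots+A_r\otimes\overline{A_r})^{1/2}$ as written:

```python
import numpy as np

def spectral_radius(A):
    """rho(A): the largest absolute value of an eigenvalue of A."""
    return float(np.max(np.abs(np.linalg.eigvals(A))))

def l2_spectral_radius(mats):
    """rho_2(A_1,...,A_r) = rho(sum_k A_k (x) conj(A_k))^(1/2)."""
    M = sum(np.kron(A, np.conj(A)) for A in mats)
    return spectral_radius(M) ** 0.5
```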

Suppose that $K$ is either the field of real numbers or the field of complex numbers. Suppose that $p(x_1,\dots,x_r)$ is a homogeneous non-commutative polynomial of degree $d$ with coefficients in $K$ (it is easier to define the dimensionality reduction in terms of homogeneous non-commutative polynomials than in terms of tensors).

Then define a fitness function $F$ on tuples $(X_1,\dots,X_r)$ of $m\times m$ matrices over $K$ (not all zero) by setting

$F(X_1,\dots,X_r)=\rho(p(X_1,\dots,X_r))^{1/d}/\rho_2(X_1,\dots,X_r)$.

This function $F$ is bounded, and it has a maximum value, but to prove that it attains its maximum value, we need to use quantum channels.
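The fitness function is easy to evaluate numerically. Below is a sketch of my own (reusing `spectral_radius` and `l2_spectral_radius` from above); the dictionary encoding of the polynomial and the exact normalization of $F$ are assumptions made for illustration:

```python
import numpy as np

def evaluate_polynomial(coeffs, mats):
    """p(X_1,...,X_r) for a homogeneous polynomial given as a dict mapping
    words (i_1,...,i_d), with entries in {0,...,r-1}, to coefficients."""
    m = mats[0].shape[0]
    total = np.zeros((m, m), dtype=complex)
    for word, c in coeffs.items():
        term = np.eye(m, dtype=complex)
        for i in word:
            term = term @ mats[i]
        total += c * term
    return total

def fitness(coeffs, mats, d):
    """F(X_1,...,X_r) = rho(p(X_1,...,X_r))^(1/d) / rho_2(X_1,...,X_r)."""
    num = spectral_radius(evaluate_polynomial(coeffs, mats)) ** (1.0 / d)
    return num / l2_spectral_radius(mats)
```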

We shall call a tuple $(X_1,\dots,X_r)$ where $F(X_1,\dots,X_r)$ is maximized an $L_2$-spectral radius dimensionality reduction (LSRDR) of the non-commutative polynomial $p$. The motivation behind the notion of an LSRDR is that it is easier to represent the variables $x_1,\dots,x_r$ as the matrices $X_1,\dots,X_r$ than it is to work with the non-commutative polynomial $p$ itself. The $m\times m$ matrices $X_1,\dots,X_r$ have $rm^2$ parameters in total, while the non-commutative polynomial could have up to $r^d$ parameters, where $d$ is the degree of the polynomial $p$.

We observe that if $p$ is a quadratic non-commutative homogeneous polynomial with coefficient matrix $B$ (so that $p(x_1,\dots,x_r)=\sum_{i,j}b_{i,j}x_ix_j$), then the maximum value of $F$ can already be expressed in terms of standard matrix quantities attached to $B$, such as the Frobenius norm. In other words, we already have a well-developed theory of matrices, and LSRDRs do not improve the theory of matrices, but LSRDRs help us analyze tensors of order 3 in several different ways.

Given square matrices $A_1,\dots,A_r$, define a completely positive superoperator $\Phi(A_1,\dots,A_r)$ by setting $\Phi(A_1,\dots,A_r)(X)=A_1XA_1^*+\dots+A_rXA_r^*$. The operator $\Phi(A_1,\dots,A_r)$ is similar to the matrix $A_1\otimes\overline{A_1}+\dots+A_r\otimes\overline{A_r}$.
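As a sanity check (again my own sketch, not from the post), one can verify numerically that $\Phi(A_1,\dots,A_r)$ and $A_1\otimes\overline{A_1}+\dots+A_r\otimes\overline{A_r}$ act in the same way once matrices are flattened; with numpy's row-major flattening the Kronecker sum is literally the matrix of $\Phi$, so in particular the two have the same spectrum:

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 4, 3
A = [rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)) for _ in range(r)]

def phi(X):
    """The completely positive map Phi(A_1,...,A_r) applied to X."""
    return sum(Ak @ X @ Ak.conj().T for Ak in A)

M = sum(np.kron(Ak, np.conj(Ak)) for Ak in A)   # matrix of Phi in the flattened basis
X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
assert np.allclose(phi(X).reshape(-1), M @ X.reshape(-1))
```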

Observation: Suppose that $p$ is a non-commutative homogeneous polynomial of degree $d$ with random complex coefficients. Let $(X_1,\dots,X_r)$ be an $L_2$-spectral radius dimensionality reduction of $p$. Then we often have $\operatorname{Tr}(q(X_1,\dots,X_r))=0$ whenever $q$ is a homogeneous non-commutative polynomial of degree $e$ where $e$ is not a multiple of $d$. Furthermore, the set of eigenvalues of $X_1\otimes\overline{X_1}+\dots+X_r\otimes\overline{X_r}$ is invariant under rotation by the angle $2\pi/d$. Said differently, $e^{2\pi i/d}\lambda$ is an eigenvalue of $X_1\otimes\overline{X_1}+\dots+X_r\otimes\overline{X_r}$ whenever $\lambda$ is.

I currently do not have an adequately developed explanation for why the traces vanish and the eigenvalues exhibit this rotational symmetry so often (more experimentation is needed), but such an explanation is probably within reach. The observation does not occur 100 percent of the time, since we get $\operatorname{Tr}(q(X_1,\dots,X_r))=0$ only when the conditions are right.

If $A=X_1\otimes\overline{X_1}+\dots+X_r\otimes\overline{X_r}$ and $\lambda_1,\dots,\lambda_{m^2}$ are the eigenvalues of $A$ listed with multiplicity, then

$\operatorname{Tr}(A^e)=\lambda_1^e+\dots+\lambda_{m^2}^e$. Therefore, $\operatorname{Tr}(A^e)=0$ whenever $e$ is not a multiple of $d$ precisely when the multiset $\{\lambda_1,\dots,\lambda_{m^2}\}$ is invariant under multiplication by $e^{2\pi i/d}$ (the power sums $\operatorname{Tr}(A^e)$ for $e\geq 1$ determine the eigenvalues). Furthermore,

$\operatorname{Tr}(A^e)=\sum_{i_1,\dots,i_e}\operatorname{Tr}(X_{i_1}\cdots X_{i_e})\cdot\overline{\operatorname{Tr}(X_{i_1}\cdots X_{i_e})}=\sum_{i_1,\dots,i_e}|\operatorname{Tr}(X_{i_1}\cdots X_{i_e})|^2$, so $\operatorname{Tr}(A^e)=0$ precisely when $\operatorname{Tr}(X_{i_1}\cdots X_{i_e})=0$ whenever $i_1,\dots,i_e\in\{1,\dots,r\}$.

Therefore, the set of eigenvalues of $A$ is invariant under multiplication by $e^{2\pi i/d}$ precisely when $\operatorname{Tr}(X_{i_1}\cdots X_{i_e})=0$ whenever $e$ is not a multiple of $d$ and $i_1,\dots,i_e\in\{1,\dots,r\}$.
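The observation can be tested numerically. The following experiment sketch is my own illustration (it reuses `fitness`, `l2_spectral_radius`, and the polynomial encoding from the sketches above); the derivative-free optimizer, the sizes $r=2$, $d=3$, $m=3$, and the exact fitness normalization are all assumptions, and the traces will only be as close to zero as the optimizer manages to get to an actual maximum of $F$:

```python
import itertools
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
r, d, m = 2, 3, 3          # number of variables, degree, size of reduced matrices
coeffs = {w: rng.standard_normal() + 1j * rng.standard_normal()
          for w in itertools.product(range(r), repeat=d)}

def unpack(v):
    """Turn a real parameter vector into r complex m-by-m matrices."""
    x = v[:r * m * m] + 1j * v[r * m * m:]
    return [x[k * m * m:(k + 1) * m * m].reshape(m, m) for k in range(r)]

def neg_fitness(v):
    return -fitness(coeffs, unpack(v), d)      # fitness(...) as sketched earlier

res = minimize(neg_fitness, rng.standard_normal(2 * r * m * m),
               method="Nelder-Mead", options={"maxiter": 100000, "maxfev": 100000})
X = unpack(res.x)
s = l2_spectral_radius(X)
X = [Xk / s for Xk in X]                        # normalize the overall scale

def word_product(word):
    P = np.eye(m, dtype=complex)
    for i in word:
        P = P @ X[i]
    return P

# Traces of words whose length is not a multiple of d should be near zero.
for e in (1, 2, 4):
    worst = max(abs(np.trace(word_product(w)))
                for w in itertools.product(range(r), repeat=e))
    print("max |trace| over words of length", e, ":", worst)

# The spectrum of sum_k X_k (x) conj(X_k) should be close to invariant
# under multiplication by exp(2*pi*i/d).
eigs = np.linalg.eigvals(sum(np.kron(Xk, np.conj(Xk)) for Xk in X))
print(np.round(np.sort_complex(eigs), 3))
print(np.round(np.sort_complex(np.exp(2j * np.pi / d) * eigs), 3))
```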

Remark:

LSRDRs of tensors are well-behaved in other ways besides having trace zero. For example, if we train two LSRDRs $(X_1,\dots,X_r)$ and $(Y_1,\dots,Y_r)$ of the same tensor starting from different random initializations, then we typically have $X_j=CY_jC^{-1}$ for all $j$ for some fixed invertible matrix $C$ (but this does not happen 100 percent of the time either). After training, the resulting LSRDR therefore does not have any random information left over from the initialization or the training, and any random information present in an LSRDR was originally in the tensor itself.
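Here is a rough way to test whether two optimized tuples are simultaneously similar (my own sketch; the exact form of the relationship between the two trained tuples is an assumption): stack the linear constraints $X_jC=CY_j$ and inspect the smallest singular value of the resulting system.

```python
import numpy as np

def simultaneous_similarity(X, Y):
    """Look for a single matrix C with X_j C = C Y_j for every j.
    Returns the best candidate C and the smallest singular value of the
    stacked constraints; a value near zero indicates that such a C exists."""
    m = X[0].shape[0]
    I = np.eye(m)
    K = np.vstack([np.kron(Xj, I) - np.kron(I, Yj.T) for Xj, Yj in zip(X, Y)])
    _, s, vh = np.linalg.svd(K)
    return vh[-1].conj().reshape(m, m), s[-1]
```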

Remark:

We have some room to modify our fitness function while still retaining the properties of LSRDRs of tensors. For example, suppose that $p$ is a homogeneous non-commutative polynomial of degree $d$, and define a modified fitness function $F_s$ by setting

$F_s(X_1,\dots,X_r)=\|p(X_1,\dots,X_r)\|_s^{1/d}/\rho_2(X_1,\dots,X_r)$. Then if $p$ is a random homogeneous non-commutative complex polynomial, $1\le s<\infty$, $\|\cdot\|_s$ denotes the Schatten $s$-norm (which is the $\ell^s$ norm of the singular values of a matrix), and $(X_1,\dots,X_r)$ maximizes $F_s$, then (if everything works out right) we would still have $\operatorname{Tr}(X_{i_1}\cdots X_{i_e})=0$ whenever $e$ is not a multiple of $d$.
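A Schatten-norm variant of the fitness function is also easy to sketch numerically (again my own illustration, reusing the helpers above; the exact normalization of $F_s$ is an assumption):

```python
import numpy as np

def schatten_norm(A, s):
    """The l^s norm of the singular values of A."""
    return float(np.linalg.norm(np.linalg.svd(A, compute_uv=False), ord=s))

def fitness_schatten(coeffs, mats, d, s):
    """F_s(X_1,...,X_r) = ||p(X_1,...,X_r)||_s^(1/d) / rho_2(X_1,...,X_r)."""
    num = schatten_norm(evaluate_polynomial(coeffs, mats), s) ** (1.0 / d)
    return num / l2_spectral_radius(mats)
```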

Conclusion: 

Since LSRDRs of tensors do not leave behind any random information that is not already present in the tensors themselves, we should expect LSRDRs to be much more interpretable than machine learning systems like neural networks that do retain a lot of random information left over from the initialization. Since LSRDRs of tensors give us so many trace-zero operators, one should consider LSRDRs of tensors to be very well-behaved systems, and well-behaved systems should be much more interpretable than poorly behaved systems.

I look forward to using LSRDRs of tensors to interpret machine learning models and to produce new, highly interpretable machine learning models. I do not see LSRDRs of tensors replacing deep learning, but LSRDRs have properties that are hard to reproduce using deep learning, so I look forward to exploring the possibilities with LSRDRs of tensors. I will make more posts about LSRDRs of tensors and other objects produced with similar objective functions.

Edits: (10/12/2023) I originally claimed that my dimensionality reduction does not work well for certain tensors, but after re-experimenting, I was able to reduce random tensors of that kind to tuples of matrices, and the resulting dimensionality reduction performed well.

Edited 1/10/2024

1 comment

comment by Joseph Van Name (joseph-van-name) · 2023-08-07T09:50:12.632Z

Massively downvoting mathematics without commenting at all just shows that the people on this site are very low-quality specimens who do not care about rationality at all but who just want to pretend to be smart.